Abstract: With the rise of Software Defined Networks (SDN),
there is growing interest in dynamic and centralized traffic
engineering, where decisions about forwarding paths are taken
dynamically from a network-wide perspective. Frequent path
reconfiguration can significantly improve the network performance,
but should be handled with care, so as to minimize disruptions
that may occur during network updates.

In this paper we introduce Time4, an approach that uses 
accurate time to coordinate network updates. Time4 is a powerful
tool in softwarized environments that can be used for various
network update scenarios. Specifically, we characterize a set of
update scenarios called flow swaps, for which Time4 is the optimal 
update approach, yielding less packet loss than existing update 
approaches. We define the lossless flow allocation problem, and 
formally show that in environments with frequent path allocation, 
scenarios that require simultaneous changes at multiple network 
devices are inevitable. 

We present the design, implementation, and evaluation of a
Time4-enabled OpenFlow prototype. The prototype is publicly
available as open source. Our work includes an extension to 
the OpenFlow protocol that has been adopted by the Open 
Networking Foundation (ONF), and is now included in OpenFlow 
1.5. Our experimental results show the significant advantages 
of Time4 compared to other network update approaches, and 
demonstrate an SDN use case that is infeasible without Time4. 

Time is what keeps everything from happening at once 

- Ray Cummings 

I. Introduction

A. It’s About Time 

The use of synchronized clocks was first introduced in
the 19th century by the Great Western Railway company in
Great Britain. Clock synchronization has significantly evolved
since then, and is now a mature technology that is used by a
variety of applications, from mobile backhaul networks [3] to
distributed databases [4].

The Precision Time Protocol (PTP), defined in the IEEE 
1588 standard [5], can synchronize clocks to a very high 
degree of accuracy, typically on the order of 1 microsecond [3], 
[6], [7]. PTP is a common and affordable feature in commodity
switches. Notably, 9 out of the 13 SDN-capable switch silicons
listed in the Open Networking Foundation (ONF) SDN
Product Directory [8] have native IEEE 1588 support [9]-[17].

^This report is an extended version of [1], which was accepted to IEEE
INFOCOM ’16, San Francisco, April 2016. A preliminary version of this
report was published in arXiv [2] in May 2015.

* Yoram Moses is the Israel Pollak academic chair at Technion.


In this work we introduce Time4, a generic tool for using
time in SDN. One of the products of this work is a new
feature that enables timed updates in OpenFlow, and has been
incorporated in OpenFlow 1.5. Furthermore, we present a
class of update scenarios in which the use of accurate time
is provably optimal, while existing update methods are
suboptimal.

B. The Challenge of Dynamic Traffic Engineering in SDN 

Defining network routes dynamically, based on a complete
view of the network, can significantly improve the network
performance compared to the use of distributed routing
protocols. SDN and OpenFlow [18], [19] have been leading
trends in this context, but several other ongoing efforts offer
similar concepts. The Interface to the Routing System (I2RS)
working group [20], and the Forwarding and Control Element
Separation (ForCES) working group [21] are two examples of
such ongoing efforts in the Internet Engineering Task Force
(IETF).

Centralized network updates, whether they are related to
network topology, security policy, or other configuration
attributes, often involve multiple network devices. Hence,
updates must be performed in a way that strives to minimize
temporary anomalies such as traffic loops, congestion, or
disruptions, which may occur during transient states where
the network has been partially updated.

While SDN was originally considered in the context of 
campus networks [18] and data centers [22], it is now also 
being considered for Wide Area Networks (WANs) [23], [24], 
carrier networks, and mobile backhaul networks [25]. 

WAN and carrier-grade networks require a very low packet
loss rate. Carrier-grade performance is often associated with
the term five nines, representing an availability of 99.999%.
Mobile backhaul networks require a Frame Loss Ratio (FLR)
of no more than $10^{-4}$ for voice and video traffic, and no
more than $10^{-3}$ for lower priority traffic [26]. Other types
of carrier network applications, such as storage and financial
trading, require even lower loss rates [27], on the order of
$10^{-5}$.

Several recent works have explored the realm of dynamic
path reconfiguration, with frequent updates on the order of
minutes [23], [24], [28], enabled by SDN. Interestingly, for
voice and video traffic, a frame loss ratio of up to $10^{-4}$ implies
that service must not be disrupted for more than 6 milliseconds
per minute. Hence, if path updates occur on a per-minute basis,
then transient disruptions must be limited to a short period of
no more than a few milliseconds.
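
The 6-millisecond budget follows from simple arithmetic. A minimal sketch, assuming a constant sending rate and that every frame sent during an outage is lost (both are simplifying assumptions, and the function name is illustrative):

```python
def max_outage_ms(frame_loss_ratio: float, interval_s: float) -> float:
    """Longest total outage, in milliseconds, allowed within one interval.

    Assumes a constant sending rate and that all frames sent during the
    outage are lost, so the lost fraction equals outage time / interval.
    """
    return frame_loss_ratio * interval_s * 1000.0


if __name__ == "__main__":
    # A loss ratio of 1e-4 with one path update per minute.
    print(f"{max_outage_ms(1e-4, 60.0):.1f} ms")
```

Under these assumptions, halving the update interval also halves the outage that each update may cause.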


C. Timed Network Updates

We explore the use of accurate time as a tool for performing 
coordinated network updates in a way that minimizes packet 
loss. Softwarized management can significantly benefit from 
using time for coordinating network-wide orchestration, and 
for enforcing a given order of events. We introduce Time4, 
which is an update approach that performs multiple changes 
at different switches at the same time. 

Example 1. Fig. 1 illustrates a flow swapping scenario. In this
scenario, the forwarding paths of two flows, $f_1$ and $f_2$, need to
be reconfigured, as illustrated in the figure. It is assumed that
all links in the network have an identical capacity of 1 unit,
and that both $f_1$ and $f_2$ require a bandwidth of 1 unit. In the
presence of accurate clocks, by scheduling $S_1$ and $S_3$ to update
their paths at the same time, there is no congestion during
the update procedure, and the reconfiguration is smooth. As
clocks will typically be reasonably well synchronized, albeit
not perfectly synchronized, such a scheme will result in a very
short period of congestion.



Fig. 1: Flow Swapping—Flows need to convert from the 
“before” configuration to the “after”. 


In this paper we show that in a dynamic environment, where
flows are frequently added, removed or rerouted, flow swaps
are inevitable. A notable example of the importance of flow
swaps is a recently published work by Fox Networks [29], in
which accurately timed flow swaps are essential in the context
of video switching.

One of our key results is that simultaneous updates are the 
optimal approach in scenarios such as Example 1, whereas 
other update approaches may yield considerable packet loss, 
or incur higher resource overhead. Note that such packet 
loss can be reduced either by increasing the capacity of the 
communication links, or by increasing the buffer memories in 
the switches. We show that for a given amount of resources, 
Time4 yields lower packet loss than other approaches. 

Accuracy is a key requirement in Time4; since updates 
cannot be applied at the exact same instant at all switches, 
they are performed within a short time interval called the 
scheduling error. The experiments we present in Section IV 
show that the scheduling error in software switches is on the 
order of 1 millisecond. The TCAM-based hardware solution 
of [30] can execute scheduled events in existing switches with 
an accuracy on the order of 1 microsecond. 


Accurate time is a powerful abstraction for SDN programmers,
not only for flow swaps, but also for timed consistent
updates, as discussed by [31].

D. Related Work 

Time and synchronized clocks have been used in various 
distributed applications, from mobile backhaul networks [3] 
to distributed databases [4]. Time-of-day routing [32] routes 
traffic to different destinations based on the time-of-day. Path 
calendaring [33] can be used to configure network paths 
based on scheduled or foreseen traffic changes. The two latter 
examples are typically performed at a low rate and do not 
place demanding requirements on accuracy. 

Various network update approaches have been analyzed in
the literature. A common approach is to use a sequence of
configuration commands [28], [34]-[36], whereby the order of
execution guarantees that no anomalies are caused in intermediate
states of the procedure. However, as observed by [28], in
some update scenarios, known as deadlocks, there is no order
that guarantees a consistent transition. Two-phase updates [37]
use configuration version tags to guarantee consistency during
updates. However, as per [37], two-phase updates cannot
guarantee congestion freedom, and are therefore not effective
in flow swap scenarios, such as Fig. 1. Hence, in flow swap
scenarios the order approach and the two-phase approach
produce the same result as the simple-minded approach, in
which the controller sends the update commands as close as
possible to instantaneously, and hopes for the best.

In this paper we present Time4, an update approach that is 
most effective in flow swaps and other deadlock [28] scenarios, 
such as Fig. 1. We refer to update approaches that do not use 
time as untimed update approaches. 

In SWAN [23], the authors suggest that reserving unused
scratch capacity of 10-30% on every link can allow
congestion-free updates in most scenarios. The B4 [24] approach
prevents packet loss during path updates by temporarily
reducing the bandwidth of some or all of the flows. Our approach
does not require scratch capacity, and does not reduce
the bandwidth of flows during network updates. Furthermore,
in this paper we show that variants of SWAN and B4 that make
use of Time4 can perform better than the original versions.

A recently published work by Fox Networks [29] shows that 
accurately timed path updates are essential for video swapping. 
We analyze this use case further in Section IV. 

Rearrangeably non-blocking topologies (e.g., [38]) allow
new traffic flows to be added to the network by rearranging
existing flows. The analysis of flow swaps presented in this
paper emphasizes the requirement to perform simultaneous
reroutes during the rearrangement procedure, an aspect which
has not been previously studied.

Preliminary work-in-progress versions of the current paper
introduced the concept of using time in SDN [39] and the flow
swapping scenario [40]. The use of time for consistent updates
was discussed in [31]. TimeFlip [30] presented a practical
method of implementing timed updates. The current work is
the first to present a generic protocol for performing timed
updates in SDN, and the first to analyze flow swaps, a natural
application in which timed updates are the optimal update
approach.

E. Contributions 

The main contributions of this paper are as follows: 

• We consider a class of network update scenarios called 
flow swaps, and show that simultaneous updates using 
synchronized clocks are provably the optimal approach 
of implementing them. In contrast, existing approaches 
for consistent updates (e.g., [28], [37]) are not applicable 
to flow swaps, and other update approaches such as 
SWAN [23] and B4 [24] can perform flow swaps, but 
at the expense of increased resource overhead. 

• We use game-theoretic analysis to show that flow swaps 
are inevitable in the dynamic nature of SDN. 

• We present the design, implementation and evaluation of 
a prototype that performs timed updates in OpenFlow. 

• Our work includes an extension to the OpenFlow protocol
that has been approved by the ONF and integrated into
OpenFlow 1.5 [41], and into the OpenFlow 1.3.x extension
package [42]. The source code of our prototype is
publicly available [43].

• We present experimental results that demonstrate the
advantage of timed updates over existing approaches. Moreover,
we show that existing update approaches (SWAN
and B4) can be improved by using accurate time.

• Our experiments include an emulation of an SDN- 
controlled video swapping scenario, a real-life use case 
that has been shown [29] to be infeasible with previous 
versions of OpenFlow, which did not include our time 
extension. 

II. The Lossless Flow Allocation (LFA) Problem 
A. Inevitable Flow Swaps 

Fig. 1 presents a scenario in which it is necessary to swap 
two flows, i.e., to update two switches at the same time. In 
this section we discuss the inevitability of flow swaps; we 
show that there does not exist a controller routing strategy 
that avoids the need for flow swaps. 

Our analysis is based on representing the flow-swap problem 
as an instance of an unsplittable flow problem, as illustrated 
in Fig. 2b. The topology of the graph in Fig. 2b models the 
traffic behavior to a given destination in common multi-rooted 
network topologies such as fat-tree and Clos (Fig. 2a). 

The unsplittable flow problem [44] has been thoroughly
discussed in the literature; given a directed graph, a source
node $s$, a destination node $d$, and a set of flow demands
(commodities) between $s$ and $d$, the goal is to maximize the
traffic rate from the source to the destination. In this paper we
define a game between two players: a source^ that generates
traffic flows (commodities) and a controller that reconfigures
the network forwarding rules in a way that allows the network
to forward all traffic generated by the source without packet
losses.

^The source player does not represent a malicious attacker; it is an 
‘adversary’, representing the worst-case scenario. 




Fig. 2: Modeling a Clos topology as an unsplittable flow graph. 

Our main argument, phrased in Theorem 1, is that the source
has a strategy that forces the controller to perform a flow swap,
i.e., to reconfigure the path of two or more flows at the same
time. Thus, a scenario in which multiple flows must be updated
at the same time is inevitable, implying the importance of
timed updates.

Moreover, we show that the controller can be forced to
invoke $n$ individual commands that should optimally be performed
at the same time. Update approaches that do not use
time, also known as untimed approaches, cause the updates to
be performed over a long period of time, potentially resulting
in slow and possibly erratic response times and significant
packet loss. Timed coordination allows us to perform the
$n$ updates within a short time interval that depends on the
scheduling error.

Although our analysis focuses on the topology of Fig. 2b, it
can be shown that the results are applicable to other topologies
as well, where the source can force the controller to perform
a swap over the edges of the min-cut of the graph.

B. Model and Definitions

We now introduce the lossless flow allocation (LFA) problem;
it is not presented as an optimization problem, but rather
as a game between two players: a source and a controller. As
the source adds or removes flows (commodities), the controller
reconfigures the forwarding rules so as to guarantee that all
flows are forwarded without packet loss. The controller’s goal
is to find a forwarding path for all the flows in the system
without exceeding the capacity of any of the edges, i.e., to
completely avoid loss of packets from the given flows. The
source’s goal is to progressively add flows, without exceeding
the network’s capacity, forcing the controller to perform a flow
swap. We shall show that the source has a strategy that forces
the controller to swap traffic flows simultaneously in order to
avoid packet loss.

Our model makes three basic assumptions: (i) each flow
has a fixed bandwidth, (ii) the controller strives to avoid
packet loss, and (iii) flows are unsplittable. We discuss these
assumptions further in Sec. V.

The term flow in classic flow problems typically refers to
the amount of traffic that is forwarded through each edge of
the graph. Since our analysis focuses on SDN, we deviate
slightly from the common flow problem terminology, and use
the term flow in its OpenFlow sense, i.e., a set of packets
that share common properties, such as source and destination
network addresses. A flow in our context can be seen as a
session between the source and destination that runs traffic at
a fixed rate.

The network is represented by a directed weighted acyclic
graph (Fig. 2b), $G = (V, E, c)$, with a source $s$, a destination
$d$, and a set of intermediate nodes, $V_{in}$. Thus, $V = V_{in} \cup
\{s, d\}$. The nodes directly connected to $s$ are denoted by $O =
\{o_1, o_2, \ldots, o_n\}$. Each of the outgoing edges from the source
$s$ has an infinite capacity, whereas the rest of the edges have
a capacity $c$. For the sake of simplicity, and without loss of
generality, throughout this section we assume that $c = 1$. Such
a graph $G$ is referred to as an LFA graph.

The source node progressively transmits traffic flows towards
the destination node. Each flow represents a session
between $s$ and $d$; every flow has a constant bandwidth, and
cannot be split between two paths. A centralized controller
configures the forwarding policy of the intermediate nodes,
determining the path of each flow. Given a set of flows from $s$
to $d$, the controller’s goal is to configure the forwarding policy
of the nodes in a way that allows all flows to be forwarded
to $d$ without exceeding the capacity of any of the edges.

The set of flows that are generated by $s$ is denoted by $\mathbb{F} ::=
\{F_1, F_2, \ldots, F_k\}$. Each flow $F_i$ is defined as $F_i ::= (i, f_i, v_i)$,
where $i$ is a unique flow index, $f_i$ is the bandwidth satisfying
$0 < f_i \le c$, and $v_i$ denotes the node that the controller
forwards the flow to, i.e., $v_i \in \{o_1, o_2, \ldots, o_n\}$.

It is assumed that the controller monitors the network, and
thus it is aware of the flow set $\mathbb{F}$. The controller maintains a
forwarding function, $R_{con} : \mathbb{F} \times V_{in} \to V_{in} \cup \{d\}$. Every
node (switch) has a flow table, consisting of a set of entries;
an element $w \in \mathbb{F} \times V_{in}$ is referred to as an entry for short.
An update of $R_{con}$ is defined to be a partial function $u :
\mathbb{F} \times V_{in} \rightharpoonup V_{in} \cup \{d\}$. We define a reroute as an update $u$ that
has a single entry in its domain. We call an update that has
more than one entry in its domain a swap, and it is assumed
that all updates in a swap are performed at the same time. We
define a $k$-swap for $k \ge 2$ as a swap that updates entries in at
least $k$ different nodes. Note that a $k$-swap is possible only if
$n \ge k$, where $n$ is the number of nodes in $O$. We focus our
analysis on 2-swaps, and throughout the section we assume
that $n \ge 2$. In Section II-F we discuss $k$-swaps for values of
$k > 2$.
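
The definitions above map naturally onto simple data structures. The following is a minimal sketch rather than the paper's implementation; the type and function names, as well as the next-hop labels in the usage example, are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Flow:
    index: int        # unique flow index i
    bandwidth: float  # f_i, with 0 < f_i <= c
    ingress: str      # v_i, a node of O


Entry = Tuple[int, str]    # (flow index, node): one flow-table entry
Update = Dict[Entry, str]  # partial function: entry -> new next hop


def is_reroute(update: Update) -> bool:
    """A reroute has exactly one entry in its domain."""
    return len(update) == 1


def is_swap(update: Update) -> bool:
    """A swap has more than one entry; all are applied at the same time."""
    return len(update) > 1


def is_k_swap(update: Update, k: int) -> bool:
    """A k-swap (k >= 2) updates entries in at least k different nodes."""
    nodes = {node for (_flow, node) in update}
    return k >= 2 and is_swap(update) and len(nodes) >= k


if __name__ == "__main__":
    # Hypothetical next-hop labels "a" and "b"; not taken from the paper.
    swap: Update = {(1, "o1"): "b", (3, "o2"): "a"}
    print(is_reroute(swap), is_swap(swap), is_k_swap(swap, 2))
```

Representing an update as a dictionary keyed by entries makes the size of its domain, and hence the reroute/swap distinction, directly observable.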

C. The LFA Game 

The lossless flow allocation problem can be viewed as a
game between two players, the source and the controller. The
game proceeds by a sequence of steps; in each step the source
either adds or removes a single flow (Fig. 3), and then waits
for the controller to perform a sequence of updates (Fig. 4).
The source’s strategy $S_s(\mathbb{F}, R_{con}) = (a, F)$ is a function that
defines, for each flow set $\mathbb{F}$ and forwarding function $R_{con}$ for
$\mathbb{F}$, a pair $(a, F)$ representing the source’s next step, where
$a \in \{Add, Remove\}$ is the action to be taken by the source,
and $F = (j, f_j, v_j)$ is a single flow to be added or removed.
The controller’s strategy is defined by $S_{con}(R_{con}, a, F) = U$,
where $U = \{u_1, \ldots, u_\ell\}$ is a sequence of updates, such that
(i) at the end of each update no edge exceeds its capacity, and
(ii) at the end of the last update, $u_\ell$, the forwarding function
$R_{con}$ defines a forwarding path for all flows in $\mathbb{F}$. Notice that
when a flow is to be removed, the controller’s update is trivial;
it simply removes all the relevant entries from the domain of
$R_{con}$. Hence our analysis focuses on adding new flows.
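
The game loop can be sketched in a few lines. This is a simplified model, not the paper's prototype: the forwarding function is reduced to an assignment of each flow to one of the m edges entering d, and the `first_fit` controller is a hypothetical baseline that never reroutes existing flows.

```python
from typing import Callable, Dict, List, Tuple

Flows = Dict[int, float]       # flow index -> bandwidth
Routing = Dict[int, int]       # flow index -> index of the edge into d
Update = Dict[int, int]        # entries changed together (1 entry = reroute)
Step = Tuple[str, int, float]  # (action, flow index, bandwidth)
Controller = Callable[[Flows, Routing, int], List[Update]]

TOL = 1e-9  # tolerance for floating-point sums


def loads(flows: Flows, routing: Routing, m: int) -> List[float]:
    """Total bandwidth routed through each of the m edges into d."""
    total = [0.0] * m
    for idx, edge in routing.items():
        total[edge] += flows[idx]
    return total


def play(steps: List[Step], controller: Controller, m: int, c: float = 1.0) -> Routing:
    """Run the game; raise if the controller overloads an edge or gives up."""
    flows: Flows = {}
    routing: Routing = {}
    for action, idx, bandwidth in steps:
        if action == "Remove":
            flows.pop(idx, None)
            routing.pop(idx, None)
            continue
        flows[idx] = bandwidth
        for update in controller(dict(flows), dict(routing), idx):
            routing.update(update)  # all entries of one update apply together
            if max(loads(flows, routing, m)) > c + TOL:
                raise RuntimeError(f"edge over capacity after {update}")
        if idx not in routing:
            raise RuntimeError(f"flow {idx} could not be routed")
    return routing


def first_fit(m: int, c: float = 1.0) -> Controller:
    """A naive controller that only places the new flow and never reroutes."""
    def strategy(flows: Flows, routing: Routing, new: int) -> List[Update]:
        for edge, load in enumerate(loads(flows, routing, m)):
            if load + flows[new] <= c + TOL:
                return [{new: edge}]
        return []
    return strategy
```

With two edges, this naive controller accepts flows of 0.35, 0.35, 0.45 and 0.45, and then a flow of 0.2, but cannot place a second flow of 0.2 without rearranging existing flows.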

The following theorem, which is the crux of this section, 
argues that the source has a strategy that forces the controller 
to perform a swap, and thus that flow swaps are inevitable 
from the controller’s perspective. 

Theorem 1. Let $G$ be an LFA graph. In the LFA game over
$G$, there exists a strategy, $S_s$, for the source that forces every
controller strategy, $S_{con}$, to perform a 2-swap.

Proof: Let $m$ be the number of incoming edges to the
destination node $d$ in the LFA graph (see Fig. 2b). For $m = 1$
the claim is trivial. Hence, we start by proving the claim
for $m = 2$, i.e., there are two edges connected to node $d$,
edges $e_1$ and $e_2$. We show that the source has a strategy that,
regardless of the controller’s strategy, forces the controller to
use a swap. In the first four steps of the game, the source
generates four flows, $F_1 = (1, 0.35, o_1)$, $F_2 = (2, 0.35, o_1)$,
$F_3 = (3, 0.45, o_2)$, and $F_4 = (4, 0.45, o_2)$, respectively.
According to the Source Procedure of Fig. 3, after each flow
is added, the source waits for the controller to update $R_{con}$
before adding the next flow. After the flows are added, there
are two possible cases:

(a) The controller routes symmetrically through $e_1$ and $e_2$,
i.e., a flow of 0.35 and a flow of 0.45 through each of the
edges. In this case the source’s strategy at this point is to
generate a new flow $F_5 = (5, 0.3, o_1)$ with a bandwidth
of 0.3. The only way the controller can accommodate $F_5$
is by routing $F_1$ and $F_2$ through the same edge, allowing
the new 0.3 flow to be forwarded through that edge. Since
there is no sequence of reroute updates that allows the


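Source Procedure

$\mathbb{F} \leftarrow \emptyset$
repeat at every step
    $(a, F) \leftarrow S_s(\mathbb{F}, R_{con})$
    if $a = Add$
        $\mathbb{F} \leftarrow \mathbb{F} \cup \{F\}$
        Wait for the controller to complete updates
    else // $a = Remove$
        $\mathbb{F} \leftarrow \mathbb{F} \setminus \{F\}$

Fig. 3: The LFA game: the source’s procedure.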
Controller Procedure

repeat at every step
    $\{u_1, \ldots, u_\ell\} \leftarrow S_{con}(R_{con}, a, F)$
    for $j \in [1, \ell]$
        Update $R_{con}$ according to $u_j$

Fig. 4: The LFA game: the controller’s procedure.


controller to reach the desired $R_{con}$, the only way to reach
a state where $F_1$ and $F_2$ are routed through the same edge
is to swap a 0.35 flow with a 0.45 flow. Thus, by issuing
$F_5$ the source forces a flow swap as claimed.

(b) The controller routes $F_1$ and $F_2$ through one edge, and $F_3$
and $F_4$ through the other edge. In this case the source’s
strategy is to generate two flows, $F_6$ and $F_7$, with a
bandwidth of 0.2 each. The controller must route $F_6$
through the edge with $F_1$ and $F_2$. Now each path sustains
a bandwidth of 0.9 units. Thus, when $F_7$ is added by the
source, the controller is forced to perform a swap between
one of the 0.35 flows and one of the 0.45 flows.

In both cases the controller is forced to perform a 2-swap,
swapping a flow from $o_1$ with a flow from $o_2$. This proves the
claim for $m = 2$.

The case of $m > 2$ is obtained by reduction to $m = 2$: the
source first generates $m - 2$ flows with a bandwidth of 1 each,
causing the controller to saturate $m - 2$ edges connected to
node $d$ (without loss of generality $e_3, \ldots, e_m$). At this point
there are only two available edges, $e_1$ and $e_2$. From this point,
the proof is identical to the case of $m = 2$. ■
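
Both cases of the proof for m = 2 can be checked by exhaustive search. The sketch below is an illustration under simplifying assumptions: routing is modeled as an assignment of each flow to one of the two edges entering d, bandwidths are expressed in hundredths to keep the arithmetic exact, and all names are hypothetical.

```python
from collections import deque
from itertools import combinations, product
from typing import Iterator, List, Tuple

CAP = 100  # edge capacity 1.0, in hundredths
State = Tuple[int, ...]  # State[i] is the edge (0 or 1) carrying flow i


def loads(bw: List[int], state: State, m: int = 2) -> List[int]:
    total = [0] * m
    for b, edge in zip(bw, state):
        total[edge] += b
    return total


def successors(bw: List[int], state: State, max_entries: int, m: int = 2) -> Iterator[State]:
    """States reachable by one update that changes at most max_entries flows
    at the same time, with no edge over capacity after the update."""
    for size in range(1, max_entries + 1):
        for idxs in combinations(range(len(state)), size):
            for edges in product(range(m), repeat=size):
                if any(state[i] == e for i, e in zip(idxs, edges)):
                    continue  # every entry of the update must change a route
                nxt = list(state)
                for i, e in zip(idxs, edges):
                    nxt[i] = e
                if max(loads(bw, tuple(nxt), m)) <= CAP:
                    yield tuple(nxt)


def can_admit(bw: List[int], start: State, new_bw: int, max_entries: int) -> bool:
    """Breadth-first search: can the controller reach a routing in which
    some edge has room for a new flow of bandwidth new_bw?"""
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if min(loads(bw, state)) + new_bw <= CAP:
            return True
        for nxt in successors(bw, state, max_entries):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    case_a = ([35, 45, 35, 45], (0, 0, 1, 1), 30)         # F5 arrives
    case_b = ([35, 35, 45, 45, 20], (0, 0, 1, 1, 0), 20)  # F7 arrives
    for bw, start, new_bw in (case_a, case_b):
        print(can_admit(bw, start, new_bw, 1), can_admit(bw, start, new_bw, 2))
```

Setting `max_entries` to 1 restricts the controller to reroutes, while 2 also permits swaps of two entries; in both cases the new flow can be admitted only when swaps are allowed.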

The proof of Theorem 1 showed that the controller can be
forced to perform a flow swap that involves $m = 2$ paths. For
$m > 2$, we assumed that the source saturates $m - 2$ paths,
reducing the analysis to the case of $m = 2$. In the following
theorem we show that for $m > 2$ the controller can be forced
to perform $\lfloor \frac{m}{2} \rfloor$ swaps.

Theorem 2. Let $G$ be an LFA graph. In the LFA game over
$G$, if $m > 2$ then there exists a strategy, $S_s$, for the source
that forces every controller strategy, $S_{con}$, to perform
$\lfloor \frac{m}{2} \rfloor$ 2-swaps.

Proof: Assume that $m$ is even. The source generates $m$
flows with a bandwidth of 0.35, $m$ flows with a bandwidth
of 0.45, and $m$ flows with a bandwidth of 0.2. The only way
the controller can route these flows without packet loss is as
follows: each path sustains three flows with three different
bandwidths, 0.2, 0.35, and 0.45. Now the source removes the
$m$ flows of 0.2, and adds $\frac{m}{2}$ flows of 0.3. As in case (a) of the
proof of Theorem 1, adding each flow of 0.3 causes a 2-swap.
The controller is thus forced to perform $\frac{m}{2} = \lfloor \frac{m}{2} \rfloor$ swaps.

If $m$ is odd, then the source can saturate one of the edges
by generating a flow with a bandwidth of 1, and then repeat
the procedure above for the remaining $m - 1$ edges, yielding
$\frac{m-1}{2} = \lfloor \frac{m}{2} \rfloor$ swaps. ■

For simplicity, throughout the rest of this section we assume
that $m = 2$. However, as in Theorem 2, the analysis can be
extended to the case of $m > 2$.
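
The claim that every loss-free routing places one flow of each bandwidth on every path can be verified by brute force for small m. A minimal sketch, assuming bandwidths are expressed in hundredths of the edge capacity (the function name is illustrative):

```python
from itertools import product
from typing import Iterator, List


def lossless_routings(m: int, cap: int = 100) -> Iterator[List[List[int]]]:
    """Yield every loss-free assignment of the 3m flows to m edges.

    Each result lists, per edge, the sorted bandwidths routed through it.
    """
    bw = [35] * m + [45] * m + [20] * m
    for state in product(range(m), repeat=len(bw)):
        per_edge: List[List[int]] = [[] for _ in range(m)]
        for b, edge in zip(bw, state):
            per_edge[edge].append(b)
        if all(sum(flows) <= cap for flows in per_edge):
            yield [sorted(flows) for flows in per_edge]


if __name__ == "__main__":
    routings = list(lossless_routings(2))
    print(len(routings), all(r == [[20, 35, 45]] * 2 for r in routings))
```

The search grows as m to the power 3m, so it is practical only for very small m; it is meant as a sanity check of the argument, not as a proof.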


D. The Impact of Flow Swaps 

We define a metric for flow swaps, by considering the
oversubscription that is caused if the flows are not swapped
simultaneously, but updated using an untimed approach.

We define the oversubscription of an edge, $e$, with respect to
a forwarding function, $R_{con}$, to be the difference between the
total bandwidth of the flows forwarded through $e$ according to
$R_{con}$, and the capacity of $e$. If the total bandwidth of the flows
through $e$ is less than the capacity of $e$, the oversubscription
is defined to be zero.

Definition 1 (Flow swap impact). Let $\mathbb{F}$ be a flow set, and
$R_{con}$ be the corresponding forwarding function. Consider a
2-swap $u : \mathbb{F} \times V_{in} \rightharpoonup V_{in} \cup \{d\}$, such that $u = u_1 \cup u_2$, where
$u_i = (w_i, v_i)$, for $w_i \in \mathbb{F} \times V_{in}$, $v_i \in V_{in} \cup \{d\}$, and $i \in \{1, 2\}$.
The impact of $u$ is defined to be the minimum of: (i) the
oversubscription caused by applying $u_1$ to $R_{con}$, or (ii) the
oversubscription caused by applying $u_2$ to $R_{con}$.

Example 2. We observe the scenario described in the proof of
Theorem 1, and consider what would happen if the two flows
had not been swapped simultaneously. The scenario had two
cases; in the first case, the bandwidth through each edge is 0.8
before the controller swaps a 0.35 flow with a 0.45 flow. Thus,
if the 0.35 flow is rerouted and then the 0.45 flow, the total
bandwidth through the congested edge is 0.8 + 0.35 = 1.15,
creating a temporary oversubscription of 0.15. Thus, the flow
swap impact in the first case is 0.15. In the second case, one
edge sustains a bandwidth of 0.7, and the other a bandwidth of
0.9. The controller needs to swap a 0.35 flow with a 0.45 flow.
If the controller first reroutes the 0.45 flow, then during the
intermediate transition period, the congested edge sustains a
bandwidth of 0.7 + 0.45 = 1.15, and thus it is oversubscribed
by 0.15. Hence, the impact in the second case is also 0.15.
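
The two computations in Example 2 can be reproduced with a short helper. This is a minimal sketch restricted to a swap between two edges; the function names and argument order are illustrative assumptions.

```python
def oversubscription(load: float, capacity: float = 1.0) -> float:
    """Excess of the load over the capacity, or zero if there is none."""
    return max(0.0, load - capacity)


def swap_impact(load_1: float, flow_1: float,
                load_2: float, flow_2: float,
                capacity: float = 1.0) -> float:
    """Impact of swapping flow_1 (now on edge 1) with flow_2 (now on edge 2).

    load_1 and load_2 are the edge loads before the swap. The impact is the
    smaller of the two oversubscriptions obtained by applying only one of
    the two reroutes.
    """
    flow_1_first = oversubscription(load_2 + flow_1, capacity)
    flow_2_first = oversubscription(load_1 + flow_2, capacity)
    return min(flow_1_first, flow_2_first)


if __name__ == "__main__":
    print(round(swap_impact(0.8, 0.35, 0.8, 0.45), 2))  # first case
    print(round(swap_impact(0.7, 0.35, 0.9, 0.45), 2))  # second case
```

Taking the minimum reflects the most favorable untimed ordering: even if the controller reroutes the less harmful flow first, the edge is still oversubscribed by this amount.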

The following theorem shows that in the LFA game, the 
source can force the controller to perform a flow swap with a 
swap impact of roughly 0.5. 

Theorem 3. Let $G$ be an LFA graph, and let $0 < \alpha < 0.5$.
In the LFA game over $G$, there exists a strategy, $S_s$, for the
source that forces every controller strategy, $S_{con}$, to perform
a swap with an impact of $\alpha$.

Proof: Let $\epsilon = 0.1 - 0.2 \cdot \alpha$. We use the source’s strategy
from the proof of Theorem 1, with the exception that the bandwidths
$f_1, \ldots, f_7$ of flows $F_1, \ldots, F_7$ are: $f_1 = f_2 = 0.5 - 2\epsilon$,
$f_3 = f_4 = 0.5 - \epsilon$, $f_5 = 4\epsilon$, and $f_6 = f_7 = 3\epsilon$.

As in the proof of Theorem 1, there are two possible cases.
In case (a), the controller routes symmetrically through the
two paths, utilizing $1 - 3\epsilon$ of the bandwidth of each path.
The source adds $F_5$ in response. To accommodate $F_5$ the
controller swaps $F_1$ and $F_3$. We determine the impact of
this swap by considering the oversubscription of performing
an untimed update; the controller first reroutes $F_1$, and only
then reroutes $F_3$. Hence, the temporary oversubscription is
$1 - 3\epsilon + 0.5 - 2\epsilon - 1 = 1.5 - 5\epsilon - 1$. Thus, the impact
is $0.5 - 5\epsilon = \alpha$. In case (b), the controller forwards $F_1$
through the same path as $F_2$, and $F_3$ through the same path
as $F_4$. The source responds by generating $F_6$ and $F_7$. Again,
the controller is forced to swap between $F_1$ and $F_3$. We
compute the impact by considering an untimed update, where
the controller reroutes $F_3$ first, causing an oversubscription of
$1 - 4\epsilon + 0.5 - \epsilon - 1 = 0.5 - 5\epsilon = \alpha$. In both cases the source
inflicts a flow swap with an impact of $\alpha$. ■
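
The algebra of the proof can be checked with exact rational arithmetic. The sketch below evaluates the two oversubscription expressions from the proof for a given impact; the function names are illustrative, and the parameter is taken as 0.1 minus 0.2 times the desired impact, as in the proof.

```python
from fractions import Fraction
from typing import Dict

HALF = Fraction(1, 2)


def bandwidths(alpha: Fraction) -> Dict[int, Fraction]:
    """Bandwidths f_1..f_7 used by the source for a target impact alpha."""
    eps = Fraction(1, 10) - Fraction(1, 5) * alpha
    return {1: HALF - 2 * eps, 2: HALF - 2 * eps,
            3: HALF - eps, 4: HALF - eps,
            5: 4 * eps, 6: 3 * eps, 7: 3 * eps}


def impact_case_a(alpha: Fraction) -> Fraction:
    """Symmetric routing; F1 is rerouted before F3."""
    f = bandwidths(alpha)
    edge_load = f[1] + f[3]      # 1 - 3*eps on each edge
    return edge_load + f[1] - 1  # load after adding F1, minus capacity


def impact_case_b(alpha: Fraction) -> Fraction:
    """F1, F2 share an edge; F3 is rerouted onto that edge first."""
    f = bandwidths(alpha)
    return f[1] + f[2] + f[3] - 1


if __name__ == "__main__":
    for alpha in (Fraction(1, 10), Fraction(1, 4), Fraction(49, 100)):
        print(alpha, impact_case_a(alpha), impact_case_b(alpha))
```

Using `Fraction` avoids floating-point rounding, so the equality between the computed impact and the target can be tested exactly.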

Intuitively, Theorem 3 shows that not only are flow swaps 
inevitable, but they have a high impact on the network, as they 
can cause links to be congested by roughly 50% beyond their 
capacity. 


E. Network Utilization 

Theorem 1 demonstrates that regardless of the controller’s
policy, flow swaps cannot be prevented. However, the proof
of Theorem 1 uses a scenario in which the edges leading to
node $d$ are almost fully utilized, suggesting that perhaps flow
swaps are inevitable only when the traffic bandwidth is nearly
equal to the max-flow of the graph. Arguably, as suggested
in [23], by reserving some scratch capacity $\nu \cdot c$ through each
of the edges, for $0 < \nu < 1$, it may be possible to avoid flow
swaps. In the next theorem we show that if $\nu < \frac{1}{3}$ then flow
swaps are inevitable.

Theorem 4. Let $G$ be an LFA graph, in which a scratch
capacity of $\nu$ is reserved on each of the edges $e_1, \ldots, e_m$, and
let $\nu < \frac{1}{3}$. In the LFA game over $G$, there exists a strategy for
the source, $S_s$, that forces every controller strategy, $S_{con}$, to
perform a swap.

Proof: We consider a graph $G'$, in which the capacity of
each of the edges $e_1, \ldots, e_m$ is $1 - \nu$. By Theorem 3, for every
$0 < \alpha < 0.5$, there exists a strategy for the source that forces
a flow swap with an impact of $\alpha$. Thus, there exists a strategy
that forces at least one of the edges to sustain a bandwidth of
$(1 + \alpha) \cdot (1 - \nu)$. Since $\nu < \frac{1}{3}$, we have $(1 - \nu) > \frac{2}{3}$, and thus there
exists an $\alpha < 0.5$ such that $(1 + \alpha) \cdot (1 - \nu) > 1$. It follows that in
the original graph $G$, with scratch capacity $\nu$, there exists a
strategy for the source that forces the controller to perform a
flow swap in order to avoid the oversubscribed bandwidth of
$(1 + \alpha) \cdot (1 - \nu) > 1$. ■
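
The threshold of one third can be checked numerically. The sketch below assumes that a swap with impact α on an edge of reduced capacity 1 − ν momentarily loads that edge to (1 + α)(1 − ν); this reading of the proof, and the function name, are assumptions made for illustration.

```python
from fractions import Fraction
from typing import Optional

HALF = Fraction(1, 2)


def alpha_forcing_swap(nu: Fraction) -> Optional[Fraction]:
    """Return an alpha in (0, 1/2) with (1 + alpha) * (1 - nu) > 1, if any.

    The inequality is equivalent to alpha > nu / (1 - nu), so a suitable
    alpha below 1/2 exists exactly when nu / (1 - nu) < 1/2.
    """
    lower = nu / (1 - nu)
    if lower >= HALF:
        return None
    return (lower + HALF) / 2  # midpoint of the admissible range


if __name__ == "__main__":
    for nu in (Fraction(1, 10), Fraction(3, 10), Fraction(1, 3)):
        print(nu, alpha_forcing_swap(nu))
```

For a scratch capacity of 10% the helper returns a valid α, whereas at exactly one third no α below one half satisfies the inequality.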

The analysis of [23] showed that a scratch capacity of 10%
is enough to address the reconfiguration scenarios that were
considered in that work. Theorem 4 shows that even a scratch
capacity of nearly 33⅓% does not suffice to prevent flow swap
scenarios. It follows that the 10% reserve that [23] suggests
may not be sufficient in general for lossless reconfiguration.


F. n-Swaps

As defined above, a $k$-swap is a swap that involves $k$ or
more nodes. In previous subsections we discussed 2-swaps.
The following theorem generalizes Theorem 1 to $n$-swaps,
where $n$ is the number of nodes in $O$.

Theorem 5. Let $G$ be an LFA graph. In the LFA game over
$G$, there exists a strategy, $S_s$, for the source that forces every
controller strategy, $S_{con}$, to perform an $n$-swap.

Proof: For $n = 1$, the claim is trivial. For $n = 2$, the
claim was proven in Theorem 1. Thus, we assume $n \ge 3$.
If $m > 2$, the source first generates $m - 2$ flows with a rate $c$
each, and we assume without loss of generality that after the
controller allocates these flows only $e_1$ and $e_2$ remain unused.
Thus, we focus on the case where $m = 2$.

We describe a strategy, $S_s$, as required; $s$ generates three
types of flows:

• Type A: two flows $F_1, F_2$, at a rate of $h$ each: $F_1 =
(1, h, o_1)$, and $F_2 = (2, h, o_1)$.

• Type B: $n$ flows, $F_3, \ldots, F_{n+2}$, with a total rate $g$, i.e.,
at a rate of $\frac{g}{n}$ each. The source sends each of the $n$ flows
through a different node of $O$.

• Type C: $n - 1$ flows, $F_{n+3}, \ldots, F_{2n+1}$, with a total rate $g$,
i.e., $\frac{g}{n-1}$ each. The source sends each of the $n - 1$ flows
through a different node of $o_2, \ldots, o_n$.

We define $h$ and $g$ such that:

$\frac{1}{3} < h < g < \frac{1}{2}$    (1)

$g > (n^2 - n)(1 - 2h)$    (2)


We claim that for every n there exist g and h that satisfy (1)
and (2). We prove this claim by finding g and h that satisfy the
two conditions. We choose g in the range allowed by (1), and
find a valid h by solving $g > (n^2 - n)(1 - 2h)$. The latter
yields $h > \frac{1}{2} - \frac{g}{2(n^2-n)}$. Since n ≥ 3, we have
$n^2 - n \geq 6$, and thus
$\frac{g}{2(n^2-n)} \leq \frac{g}{2 \times 6} < \frac{1}{24}$.
Clearly, $\frac{g}{2(n^2-n)} > 0$. It follows that every h that
satisfies $\frac{1}{2} - \frac{g}{2(n^2-n)} < h < \frac{1}{2}$ also
satisfies $h > \frac{1}{2} - \frac{1}{24} = \frac{11}{24}$, which
exceeds the lower bound in (1). Hence, every g and h that satisfy
$\frac{1}{2} - \frac{g}{2(n^2-n)} < h < g < \frac{1}{2}$ also
satisfy (1) and (2).

Intuitively, for h and g sufficiently close to $\frac{1}{2}$ (but less
than $\frac{1}{2}$), (1) and (2) are satisfied.
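As an illustration only, the following sketch checks one concrete choice of g and h with exact rational arithmetic. The particular values returned by `pick_rates` are our own assumption (the proof only requires that suitable values exist), and the function names are hypothetical. The sketch also enumerates which partial exchanges of type B and type C flows would fit into the spare capacity 1 − g − h, which is the counting argument used later in the proof.

```python
from fractions import Fraction


def pick_rates(n):
    """Return one illustrative (g, h) pair for n >= 3 (an assumed choice, not the paper's)."""
    big_n = n * n - n
    g = Fraction(1, 2) - Fraction(1, 8 * big_n)
    h = Fraction(1, 2) - Fraction(1, 6 * big_n)
    return g, h


def satisfies_conditions(n, g, h):
    """Check h < g < 1/2 and condition (2): g > (n^2 - n)(1 - 2h)."""
    return h < g < Fraction(1, 2) and g > (n * n - n) * (1 - 2 * h)


def feasible_exchanges(n, g, h):
    """List (i, j): exchanging i type B flows (rate g/n each) with j type C flows
    (rate g/(n-1) each) adds at most the spare capacity 1 - g - h to an edge."""
    spare = 1 - g - h
    return [(i, j)
            for i in range(1, n + 1)
            for j in range(1, n)
            if abs(j * g / (n - 1) - i * g / n) <= spare]


for n in range(3, 8):
    g, h = pick_rates(n)
    print(n, satisfies_conditions(n, g, h), feasible_exchanges(n, g, h))
```

For these values the only exchange that fits is the complete one, i = n and j = n − 1, so no partial exchange of type B and type C flows is possible.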


We now prove that after generating the flows $F_1, \ldots, F_{2n+1}$,
the function $R_{con}$ forwards all type B flows through the same
path, and all type C flows through the same path. Assume by
way of contradiction that there is a forwarding function $R_{con}$
that forwards flows $F_1, \ldots, F_{2n+1}$ without loss, but does not
comply with the latter claim. We consider two distinct cases:
either the two type A flows are forwarded through the same
edge, or they are forwarded through two different edges.

• If the two type A flows are forwarded through two
different paths, then we assume that $F_1$ and the n type
B flows are forwarded through $e_1$ and that $F_2$ and the
n − 1 type C flows are forwarded through $e_2$. Thus, at
this point each of the two edges sustains traffic at a rate
of g + h. By the assumption, there exists an update that
swaps i ≤ n flows of type B with j ≤ n − 1 flows
of type C, such that after the swap none of the edges
exceeds its capacity. Thus, the update adds the bandwidth
$|j \cdot \frac{g}{n-1} - i \cdot \frac{g}{n}|$ to one of the edges, and this
additional bandwidth must fit into the available bandwidth
before the update, 1 − g − h. Hence,
$|j \cdot \frac{g}{n-1} - i \cdot \frac{g}{n}| \leq 1 - g - h$.
Note that $1 - g - h < 1 - 2h < \frac{g}{n^2-n}$, following
(1) and (2). Thus we get
$|j \cdot \frac{g}{n-1} - i \cdot \frac{g}{n}| < \frac{g}{n^2-n}$. It
follows that $|j \cdot n - i \cdot (n-1)| < 1$. Since i and j are
integers, we get that $j \cdot n - i \cdot (n-1) = 0$, and thus
$j = i \cdot \frac{n-1}{n}$. Now since i ≤ n and j ≤ n − 1 are both
natural numbers, the only solution is j = n − 1 and i = n,
which means that the flows from type B are all forwarded
through the same path, as well as the flows of type C,
contradicting the assumption.

• If the two type A flows are forwarded through the same
edge, their total bandwidth is 2h, and thus the remaining
bandwidth through this edge is 1 − 2h. From (2) we have
$\frac{g}{n^2-n} > 1 - 2h$. We note that (i) $\frac{g}{n} > \frac{g}{n^2-n}$, and
(ii) $\frac{g}{n-1} > \frac{g}{n^2-n}$. It follows that $\frac{g}{n} > 1 - 2h$, and also
$\frac{g}{n-1} > 1 - 2h$, and thus none of the type B or type C flows
fit on the same path with $F_1$ and $F_2$. Thus, all the type
B and type C flows are on the same path, contradicting
the assumption.

Fig. 5: A Scheduled Bundle: the Bundle Commit message may include $T_s$, the scheduled time of execution. The controller
can use a Bundle Discard message to cancel the Scheduled Bundle before time $T_s$.

We have shown that all flows of type B, denoted by $F_B$,
must be forwarded through the same path, and that all flows
of type C, denoted by $F_C$, are forwarded through the same
path. Thus, after the source generates the 2n + 1 flows, there
are two possible scenarios:

• The two type A flows are forwarded through the same
path, and the type B and type C flows are forwarded
through the other path. In this case the source generates
two flows at a rate of 1 − h − g each. To accommodate both
flows the controller must swap the flows of $F_B$ with $F_1$ or
the flows of $F_C$ with $F_2$. Both possible swaps involve
n entries, and thus the controller is forced to perform an
n-swap.

• One path is used for $F_1$ and the flows of $F_B$, and the other
path is used for $F_2$ and the flows of $F_C$. In this case the
source generates a flow with a bandwidth of 1 − 2h, again
forcing the controller to swap the flows of $F_C$ with $F_1$
or the flows of $F_B$ with $F_2$.

In both cases the controller is forced to perform a swap that 
involves the n nodes, i.e., an n-swap. ■ 

III. Design and Implementation 
A. Protocol Design 

1) Overview 

A Time4-enabled system comprises two main components:

• OpenFlow time extension. Time4 is built upon the
OpenFlow protocol. We define an extension to the OpenFlow
protocol that enables timed updates; the controller
can attach an execution time to every OpenFlow command
it sends to a switch, defining when the switch
should perform the required command. It should be noted
that the Time4 approach is not limited to OpenFlow; we
have defined a similar time extension to the NETCONF
protocol [45], but in this paper we focus on Time4 in the
context of OpenFlow, as described in the next subsection.

• Clock synchronization. Time4 requires the switches
and controller to maintain a local clock, allowing time-
triggered events. Hence, the local clocks should be
synchronized. The OpenFlow time extension we defined
does not mandate a specific synchronization method.
Various mechanisms may be used, e.g., the Network Time
Protocol (NTP), the Precision Time Protocol (PTP) [5], or
GPS-based synchronization. The prototype we designed
and implemented uses ReversePTP [46], as described
below.

2) OpenFlow Time Extension 

We present an extension that allows OpenFlow controllers 
to signal the time of execution of a command to the switches. 
This extension is described in full in Appendix A.^ 

Our extension makes use of the OpenFlow [19] Bundle 
feature; a Bundle is a sequence of OpenFlow messages from 
the controller that is applied as a single operation. Our time 
extension defines Scheduled Bundles, allowing all commands 
of a Bundle to come into effect at a pre-determined time. This 
is a generic means to extend all OpenFlow commands with 
the scheduling feature. 

Using Bundle messages for implementing Time4 has two
significant advantages: (i) It is a generic method to add the
time extension to all OpenFlow commands without changing
the format of all OpenFlow messages; only the format of
Bundle messages is modified relative to the Bundle message
format in [19], optionally incorporating an execution time.
(ii) The Scheduled Bundle allows a relatively straightforward
way to cancel scheduled commands, as described below.

Fig. 5 illustrates the Scheduled Bundle message procedure.
In step 1, the controller sends a Bundle Open message to
the switch, followed by one or more Add messages (step 2).
Every Add message encapsulates an OpenFlow message, e.g.,
a FLOW_MOD message. A Bundle Close is sent in step 3,
followed by the Bundle Commit (step 4), which optionally
includes the scheduled time of execution, $T_s$. The switch then
executes the desired command(s) at time $T_s$.

^A preliminary version of this extension was presented in [47].

The Bundle Discard message (step 5') allows the controller
to enforce an all-or-none scheduled update; after the Bundle
Commit is sent, if one of the switches sends an error message,
indicating that it is unable to schedule the current Bundle, the
controller can send a Discard message to all switches, canceling
the scheduled operation. Hence, when a switch receives a
scheduled commit, to be executed at time $T_s$, the switch can
verify that it can dedicate the required resources to execute the
command as close as possible to $T_s$. If the switch’s resources
are not available, for example due to another command that is
scheduled to $T_s$, then the switch replies with an error message,
aborting the scheduled commit. Significantly, this mechanism
allows switches to execute the command with a guaranteed
scheduling accuracy, avoiding the high variation that occurs
when untimed updates are used.
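The message sequence above can be sketched as a toy state machine. This is a minimal illustration under our own assumptions, not the OpenFlow message format or the prototype's code: the class `ToySwitch`, the exception `ScheduleConflict`, the `busy_times` set, and the function `all_or_none_update` are all hypothetical names, and times are plain integers.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional, Set


class BundleState(Enum):
    IDLE = auto()
    OPEN = auto()
    CLOSED = auto()
    SCHEDULED = auto()
    EXECUTED = auto()
    DISCARDED = auto()


class ScheduleConflict(Exception):
    """Toy error: the switch cannot schedule a bundle at the requested time."""


@dataclass
class ToySwitch:
    busy_times: Set[int] = field(default_factory=set)  # times already reserved (assumed model)
    state: BundleState = BundleState.IDLE
    pending: List[str] = field(default_factory=list)
    applied: List[str] = field(default_factory=list)
    exec_time: Optional[int] = None

    def bundle_open(self) -> None:                      # step 1
        self.state, self.pending = BundleState.OPEN, []

    def bundle_add(self, message: str) -> None:         # step 2
        assert self.state is BundleState.OPEN
        self.pending.append(message)

    def bundle_close(self) -> None:                     # step 3
        assert self.state is BundleState.OPEN
        self.state = BundleState.CLOSED

    def bundle_commit(self, exec_time: int) -> None:    # step 4, with a scheduled time
        assert self.state is BundleState.CLOSED
        if exec_time in self.busy_times:
            raise ScheduleConflict(exec_time)
        self.exec_time, self.state = exec_time, BundleState.SCHEDULED

    def bundle_discard(self) -> None:                   # step 5'
        self.pending, self.exec_time = [], None
        self.state = BundleState.DISCARDED

    def tick(self, now: int) -> None:
        """Apply the pending messages once the local clock reaches the scheduled time."""
        if self.state is BundleState.SCHEDULED and now >= self.exec_time:
            self.applied.extend(self.pending)
            self.state = BundleState.EXECUTED


def all_or_none_update(switches, messages, exec_time) -> bool:
    """Schedule the same bundle on every switch; discard everywhere if any switch refuses."""
    ok = True
    for sw in switches:
        sw.bundle_open()
        for msg in messages:
            sw.bundle_add(msg)
        sw.bundle_close()
        try:
            sw.bundle_commit(exec_time)
        except ScheduleConflict:
            ok = False
    if not ok:
        for sw in switches:
            sw.bundle_discard()
    return ok
```

In this sketch a refusal by any one switch leads the controller to discard the bundle on all switches, so either every switch executes the update at the scheduled time or none does.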

The OpenFlow time extension also defines Bundle Feature
Request messages, which allow the controller to query
switches about whether they support Scheduled Bundles, and
to configure some of the switch parameters related to
Scheduled Bundles.

3) Clock Synchronization: ReversePTP 

In the last decade PTP, based on the IEEE 1588 [5] standard,
has become a common feature in commodity switches,
typically providing a clock accuracy on the order of
1 microsecond.

In [46], [48] we introduced ReversePTP, a PTP variant
for SDNs. ReversePTP is based on PTP, but is conceptually
reversed. In PTP a single node periodically distributes its time
to the other nodes in the network. In ReversePTP all nodes
in the network (the switches) periodically distribute their time
to a single node (the controller). The controller keeps track of
the offsets, denoted by $\mathit{offset}_i$ for switch i, between its clock
and each of the switches’ clocks, and uses them to send each
switch individualized timed commands.

ReversePTP allows the complex clock algorithms to be 
implemented by the controller, whereas the ‘dumb’ switches 
only need to distribute their time to the controller. Following 
the SDN paradigm, the ReversePTP algorithmic logic can be 
programmed and dynamically tuned at the controller without 
affecting the switches. 

Another advantage of ReversePTP, which played an important
role in our experiments, is that ReversePTP allows
the controller to keep track of the synchronization status of
each clock; a clock synchronization protocol requires a long
setup time, typically tens of minutes. ReversePTP provides
an indication of when the setup process has completed.

As shown in [46], ReversePTP can be effectively used to
perform timed updates; in order to have switch i perform a
command at time $T_s$, the controller instructs i to perform the
command at time $T_s^i$, where $T_s^i = T_s + \mathit{offset}_i$ takes the offset
between the controller and switch i into account,^ causing i
to perform the action at time $T_s$ according to the controller’s
clock.

Fig. 6: ReversePTP in SDN: switches distribute their time
to the controller. Switches’ clocks are not synchronized. For
every switch i, the controller knows $\mathit{offset}_i$ between switch
i’s clock and its local clock.

[Fig. 7 diagram: the controller (SDN application using time-based updates, OpenFlow agent Dpctl, PTPd slave i) is connected to switch i (OpenFlow switch CPqD OFSoftswitch, PTPd master, clock) by the OpenFlow protocol using the time extension and by PTP.]

Fig. 7: Time4 prototype design: the black blocks are the
components implemented in the context of this work.


B. Prototype Design and Implementation 

We have designed and implemented a software-based
prototype of Time4, as illustrated in Fig. 7. The components we
implemented are marked in black. These components run on
Linux, and are publicly available as open source [43].

Our Time4-enabled OFSoftswitch prototype was adopted
by the ONF as the official prototype of Scheduled Bundles.^

^The time $T_s^i$ computed as described above is a first-order approximation of the desired
execution time. The controller can compute a more accurate execution time
by also considering the clock skew and drift, as discussed in [46].

Switches. Every switch i runs an OpenFlow switch software
module. Our prototype is based on the open source CPqD
OFSoftswitch [49],^ incorporating the switch scheduling module
(see Fig. 7) that we implemented. When the switch receives
a Scheduled Bundle from the controller, the switch scheduling
module schedules the respective OpenFlow command to the
desired time of execution. The switch scheduling module also
handles Bundle Feature Request messages received from the
controller.

Each switch runs a ReversePTP master, which distributes
the switch’s time to the controller. Our ReversePTP prototype
is a lightweight set of Bash scripts that is used as an
abstraction layer over the well-known open source PTPd [50]
module. Our software-based implementation uses the Linux
clock as the reference for PTPd, and for the switch’s scheduling
module. To the best of our knowledge, ours is the first
open source implementation of ReversePTP.

Controller. The controller runs an OpenFlow agent, which 
communicates with the switches using the OpenFlow protocol. 
Our prototype uses the CPqD Dpctl (Datapath Controller), 
which is a simple command line tool for sending OpenFlow 
messages to switches. We have extended Dpctl by adding 
the time extension; the Dpctl command-line interface allows 
the user to define the execution time of a Bundle Commit. 
Dpctl also allows a user to send a Bundle Feature Request to 
switches. 

The controller runs ReversePTP with n instances of PTPd
in slave mode, where n is the number of switches in the
network. One or more SDN applications can run on the controller
and perform timed updates. The application can extract the
offset, $\mathit{offset}_i$, of every switch i from ReversePTP, and use
it to compute the scheduled execution time of switch i in every
timed update. The Linux clock is used as a reference for PTPd,
and for the SDN application(s).
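A minimal sketch of this computation follows. It assumes the sign convention in which the offset of switch i is its clock value minus the controller's clock value, so that the per-switch time is the controller-side time plus the offset; the function name and the use of integer microseconds are our own choices, and clock skew and drift are ignored.

```python
def switch_execution_times(t_s_us, offsets_us):
    """Map each switch id to its individualized execution time.

    t_s_us:     desired execution time on the controller's clock (microseconds).
    offsets_us: dict of switch id -> offset in microseconds, assumed to be
                (switch clock - controller clock) as tracked by the controller.
    """
    return {switch: t_s_us + offset for switch, offset in offsets_us.items()}


# Example: two switches whose clocks are ahead of and behind the controller's clock.
print(switch_execution_times(1_000_000, {"s1": 120, "s2": -45}))
```

Each switch then receives a Scheduled Bundle carrying its own time value, and all switches act at the same instant as seen by the controller's clock, up to the accuracy of the offset estimates.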

IV. Evaluation 
A. Evaluation Method 

Environment. We evaluated our prototype on a 71-node
testbed in the DeterLab [51] environment. Each machine (PC)
in the testbed either played the role of an OpenFlow switch,
running our Time4-enabled prototype, or the role of a host,
sending and receiving traffic. A separate machine was used
as a controller, which was connected to the switches using an
out-of-band network.

We remark that we did not use Mininet [52] in our evaluation,
as Mininet is an emulation environment that runs
on a single machine, making it impractical for emulating
simultaneous or time-triggered events. We did, however, run
our prototype over Mininet in some of our preliminary testing
and verification.

^The ONF process for adding new features to OpenFlow requires every 
new feature to be prototyped. 

^OFSoftswitch is one of the two software switches used by the Open 
Networking Foundation (ONF) for prototyping new OpenFlow features. We 
chose this switch since it was the first open source OpenFlow switch to include 
the Bundle feature. 


Performance attributes. Three performance attributes play
a key role in our evaluation, as shown in Table I.

$\Delta$    The average time elapsed between two consecutive messages sent by the controller.
$I_R$       Installation latency range: the difference between the maximal rule installation latency and the minimal installation latency.
$\delta$    Scheduling error: the maximal difference between the actual update time and the scheduled update time.

TABLE I: Performance Attributes.


Intuitively, $\Delta$ and $I_R$ determine the performance of untimed
updates. $\Delta$ indicates the controller’s performance; an OpenFlow
controller can handle as many as tens of thousands [53]
to millions [54] of packets per second, depending on the
type of controller and the machine’s processing power. Hence,
$\Delta$ can vary from 1 microsecond to several milliseconds. $I_R$
indicates the installation latency variation. The installation
latency is the time elapsed from the instant the controller sends
a rule modification message until the rule has been installed.
The installation latency of an OpenFlow rule modification
(FLOW_MOD) has been shown to range from 1 millisecond
to seconds [28], [55], and grows dramatically with the number
of installations per second.

The attribute that affects the performance of timed updates
is the switches’ scheduling error, $\delta$. When an update is
scheduled to be performed at time $T_s$, it is performed in
practice at some time $t \in [T_s, T_s + \delta]$.^ The scheduling error,
$\delta$, is affected by two factors: the device’s clock accuracy,
which is the maximal offset between the clock value and
the value of an accurate time reference, and the execution
accuracy, which is a measure of how accurately the device
can perform a timed update, given run-time parameters such
as the concurrently executing tasks and the load on the
device. The achievable clock accuracy strongly depends on the
network size and topology, and on the clock synchronization
method. For example, the clock accuracy using the Precision
Time Protocol [5] is typically on the order of 1 microsecond
(e.g., [6]).

Software-based evaluation. Our experiments measure the
three performance attributes in a setting that uses software
switches. While the values we measured do not necessarily
reflect on the performance of systems that use hardware-based
switches, the merit of our evaluation is that we vary these
parameters and analyze how they affect the network update
performance with untimed approaches and with Time4.

B. Performance Attribute Measurement 

Our experiments measured the three attributes, $\Delta$, $I_R$, and $\delta$,
illustrating how accurately updates can be applied in software-
based OpenFlow implementations. It should be noted that
these three values depend on the processing power of the
testbed machine; we measured the parameters for three types
of DeterLab machines, Type I, II, and III, listed in Table II.
Each attribute was measured 100 times on each machine type,
and Fig. 8 illustrates our results. The figure graphically depicts
the values $\Delta$, $I_R$, and $\delta$ of machine Type I as an example.

^An alternative representation of the accuracy, $\delta$, assumes a symmetric
error, $T_s \pm \delta$. The two approaches are equivalent.

Fig. 8: Measurement of the three performance attributes: (a) $\Delta$, (b) $I_R$, and (c) $\delta$.


Machine Type                                  $\Delta$   $I_R$   $\delta$
I    Intel Xeon E3 LP, 2.4 GHz, 16 GB RAM     9.64       1.3     1.23
II   Intel Xeon, 2.1 GHz, 4 GB RAM            9.6        1.47    1.18
III  Intel Dual Xeon, 3 GHz, 2 GB RAM         14.27      2.72    1.19

TABLE II: Measured attributes in milliseconds.


The measured scheduling error, $\delta$, was slightly more than
1 millisecond in all the machines we tested. Our experiments
showed that the clock accuracy using ReversePTP over the
DeterLab testbed is on the order of 100 microseconds. The
measured value of $\delta$ in Table II shows the execution accuracy,
which is an order of magnitude higher. The installation
latency range, $I_R$, was slightly higher than $\delta$, around 1 to 3
milliseconds. The measured value of $\Delta$ was high, on the order
of 10 milliseconds, as Dpctl is not optimized for performance.

In software-based switches, the CPU handles both the data-
plane traffic and the communication with the controller, and
thus $I_R$ and $\delta$ can be affected by the rate of data-plane traffic
through the switch. Hence, in our experiments we fixed the
rate of traffic through each switch to 10 Mbps, allowing an
‘apples-to-apples’ comparison between experiments.

C. Microbenchmark: Video Swapping

To demonstrate how Time4 is used in a real-life scenario, 
we reconstructed the video swapping topology of [29], as 
illustrated in Fig. 9a. Two video cameras, A and B, transmit an 
uncompressed video stream to targets A and B, respectively. 
At a given point in time, the two video streams are swapped, 
so that the stream from source A is transmitted to target B, and 
the stream from B is sent to target A. As described in [29], the 
swap must be performed at a specific time instant, in which 
the video sources transmit data that is not visible to the viewer, 
making the swap unnoticeable. 

The authors of [29] noted that the precisely-timed swap
cannot be performed by an OpenFlow switch, as currently
OpenFlow does not provide abstractions for performing
accurately timed changes. Instead, [29] uses source timing, where
sources A and B are time-synchronized, and determine the
swap time by using a swap indication in the packet header. The
OpenFlow switch acts upon the swap indication to determine
the correct path for each stream. We note that the main
drawback of this source-timed approach is that the SMPTE
2022-6 video streaming standard [56], which was used in [29],
does not currently define an indication about where in the
video stream a packet comes from, and specifically does
not include an indication about the correct swapping time.
Hence, off-the-shelf streaming equipment does not provide this
indication. In [29], the authors used a dedicated Linux server
to integrate the non-standard swap indication.

Fig. 9: Microbenchmark: video swapping. (a) Topology. (b) Video swapping accuracy.

In this experiment we studied how Time4 can tackle the 
video swapping scenario, avoiding the above drawback. Each 
node in the topology of Fig. 9a was emulated by a DeterLab 
machine. We used two 10 Mbps flows, generated by Iperf [57], 
to simulate the video streams. Each swap was initiated by 
the controller 100 milliseconds in advance (as in [29]): the 
controller sent a Scheduled Bundle, incorporating two updates, 
one for each of the flows. We repeated the experiment 100 
times, and measured the scheduling error. 

The measurement was performed by analyzing capture files
taken at the sources and at the switch’s egress ports. A swap
that was scheduled to be performed at time T was considered
accurate if every packet that was transmitted by each of the
sources before time T was forwarded according to the old
configuration, and every packet that was transmitted after
T was forwarded according to the new configuration. The
scheduling error of each swap (measured in milliseconds) was
computed as the number of misrouted packets, divided by
the bandwidth of the traffic flow. The sign of the scheduling
error indicates whether the swap was performed before the
scheduled time (negative error) or after it (positive error).
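The computation can be sketched as follows. This is an illustration under the assumption that the flow's bandwidth is expressed as a packet rate in packets per millisecond, so that the quotient is a time in milliseconds; the function and parameter names are hypothetical and do not come from the measurement scripts.

```python
def scheduling_error_ms(misrouted_packets, packets_per_ms, early):
    """Signed scheduling error of one swap, in milliseconds.

    misrouted_packets: packets forwarded according to the wrong configuration.
    packets_per_ms:    packet rate of the traffic flow (assumed unit).
    early:             True if the swap took effect before the scheduled time.
    """
    magnitude = misrouted_packets / packets_per_ms
    return -magnitude if early else magnitude


# Example with made-up numbers: 2 misrouted packets at 4 packets per millisecond.
print(scheduling_error_ms(2, 4.0, early=False))
print(scheduling_error_ms(2, 4.0, early=True))
```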

Fig. 9b illustrates the empirical Probability Density Function 
(PDF) of the scheduling error of the swap, i.e., the difference 
between the actual swapping time and the scheduled swapping 
time. As shown in the figure, the swap is performed within 
±0.6 milliseconds of the scheduled swap time. We note that 
this is the achievable accuracy in a software-based OpenFlow 
switch, and that a much higher degree of accuracy, on the order 
of microseconds, can be achieved if two conditions are met: 
(i) A hardware switch is used, supporting timed updates with a 
microsecond accuracy, as shown in [30], and (ii) The cameras 
are connected to the switch over a single hop, allowing low 
latency variation, on the order of microseconds. 

D. Flow Swap Evaluation 

1) Experiment Setting 

We evaluated our prototype on a 71-node testbed in the
DeterLab environment. We used the testbed to emulate an
OpenFlow network with 32 hosts and 32 leaf switches, as
depicted in Fig. 11, with n = 32.



Fig. 11: Experimental evaluation: every host and switch was
emulated by a Linux machine in the DeterLab testbed. All
links have a capacity of 10 Mbps. The controller is
connected to the switches by an out-of-band network.

Metric. A flow swap that is not performed in a coordinated
way may bear a high cost: either packet loss, deep buffering,
or a combination of the two. We use packet loss as a metric
for the cost of flow swaps, assuming that deep buffering is not
used.

We used Iperf to generate flows from the sources to the 
destination, and to measure the number of packets lost between 
the source and the destination. 

The flow swap scenario. All experiments were flow swaps
with a swap impact of 0.5.^ We used two static flows,
which were not reconfigured in the experiment: $H_1$ generates
a 5 Mbps flow that is forwarded through $q_1$, and $H_2$ generates
a 5 Mbps flow that is forwarded through $q_2$. We generated n
additional flows (where n is the number of switches at the
bottom layer of the graph): (i) a 5 Mbps flow from $H_1$ to
the destination, and (ii) n − 1 flows, each having a bandwidth of
$\frac{5}{n-1}$ Mbps. Every flow swap in our experiment required the
flow of (i) to be swapped with the n − 1 flows of (ii). Note
that this swap has an impact of 0.5.

^By Theorem 3, the source can force the controller to perform a flow swap
with an impact as high as roughly 0.5.

2) Experimental Results 

Time4 vs. other update approaches. In this experiment
we compared the packet loss of Time4 to other update
approaches described in Sec. I-D. As discussed in Sec. I-D,
applying the order approach or the two-phase approach to flow
swaps produces similar results. This observation is illustrated
in Fig. 10b. In the rest of this section we refer to these two
approaches collectively as the untimed approaches.

In our experiments we also implemented a SWAN-
based [23] update, and a B4-based [24] update. In SWAN, we
used a 10% scratch on each of the links, and in B4 updates
we temporarily reduced the bandwidth of each flow by 10%
to avoid packet loss. As depicted in Fig. 10b, SWAN and B4
yield a slightly lower packet loss rate than Time4; the average
number of packets lost in each Time4 flow swap is 0.2, while
with SWAN and B4 only 0.1 packets are lost on average.

To study the effect of using time in SWAN and in B4,
we also performed hybrid updates, illustrated in Fig. 10c
and 10d, and in the two right-most bars of Fig. 10b. We
combined SWAN and Time4, by performing a timed update
on a network with scratch capacity, and compared the packet
loss to the conventional SWAN-based update. We repeated the
experiment for various values of scratch capacity, from 0% to
10%. As illustrated in Fig. 10c, the Time4+SWAN approach
can achieve the same level of packet loss as SWAN with
less scratch capacity. We performed a similar experiment
with a timed B4 update, varying the bandwidth reduction rate
between 0% and 10%, and observed similar results.

Number of switches. We evaluated the effect of n, the
number of switches involved in the flow swap, on the packet
loss. We performed an n-swap with n = 2, 4, 8, 16, 32. As
illustrated in Fig. 10a, the number of packets lost during an
untimed update grows linearly with the number of switches n,
while the number of packets lost in a Time4 update is less
than one on average, and is not affected by the number of
switches. In an untimed update, as n increases the update
duration^ is longer, and hence more packets are lost during
the update procedure.

Controller performance. In this experiment we explored
how the controller’s performance, represented by $\Delta$, affects
the packet loss rate in an untimed update. As $\Delta$ increases, the
update procedure requires a longer period of time, and hence
more packets are lost (Fig. 12) during the process. We note that
although previous work has shown that $\Delta$ can be on the order
of microseconds in some cases [54], Dpctl is not optimized
for performance, and hence $\Delta$ in our experiments was on the
order of milliseconds. As shown in Fig. 12, we synthetically
increased $\Delta$, and observed its effect on the packet loss during
flow swaps.

^The update duration is the time elapsed from the instant the first switch
is updated until the instant the last switch is updated. In our setting the update
duration is roughly $(n - 1)\Delta$.

Fig. 10: Flow swap performance: in large networks (a) Time4 allows significantly less packet loss than untimed approaches.
The packet loss of Time4 is slightly higher than SWAN and B4 (b), while the latter two methods incur higher overhead.
Combining Time4 with SWAN or B4 provides the best of both worlds: low packet loss (b) and low overhead (c and d).

Fig. 12: The number of packets lost in a flow swap vs. $\Delta$.
The packet loss in Time4 is not affected by the controller’s
performance ($\Delta$).

Installation latency variation. Our next experiment
(Fig. 13a) examined how the installation latency variation,
denoted by $I_R$, affects the packet loss during an untimed
update. We analyzed different values of $I_R$: in each update we
synthetically determined a uniformly distributed installation
latency, $I \sim U[0, I_R]$. As shown in Fig. 13a, the switch’s
installation latency range, $I_R$, dramatically affects the packet
loss rate during an untimed update. Notably, when $I_R$ is on the
order of 1 second, as in the extreme scenarios of [28], [55],
Time4 has a significant advantage over the untimed approach.

Scheduling error. Figure 13b depicts the packet loss as a
function of the scheduling error of Time4. By Fig. 10a, 13a
and 13b, we observe that if $\delta$ is sufficiently low compared
to $I_R$ and $(n - 1)\Delta$, then Time4 outperforms the untimed
approaches. Note that even if switches are not implemented
with extremely low scheduling error $\delta$, we expect Time4 to
outperform the untimed approach, as typically $\delta < I_R$, as
further discussed in Section V.
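The comparison can be illustrated with a small simulation of the update spread, i.e., the time between the first and the last switch being updated. This is a sketch under simplifying assumptions of our own: in the untimed case the controller sends message k at time k·Δ and each switch adds an independent installation latency drawn from U[0, I_R]; in the timed case each switch executes at an independent time drawn uniformly from the interval of length δ that starts at the scheduled time. The function names are hypothetical, and the numeric inputs are the Type I values of Table II.

```python
import random


def untimed_spread(n, delta, install_range, rng):
    """Spread of update times for an untimed update over n switches (same unit as inputs)."""
    times = [k * delta + rng.uniform(0, install_range) for k in range(n)]
    return max(times) - min(times)


def timed_spread(n, sched_error, rng):
    """Spread of update times for a timed update; the scheduled time is taken as 0."""
    times = [rng.uniform(0, sched_error) for _ in range(n)]
    return max(times) - min(times)


rng = random.Random(0)  # fixed seed so that repeated runs give the same numbers
n, delta, install_range, sched_error = 32, 9.64, 1.3, 1.23  # milliseconds
print("untimed spread [ms]:", untimed_spread(n, delta, install_range, rng))
print("timed spread   [ms]:", timed_spread(n, sched_error, rng))
```

Under this model the untimed spread is at least (n − 1)Δ − I_R, whereas the timed spread never exceeds δ, which is consistent with the observation that Time4 is preferable when δ is small compared to I_R and (n − 1)Δ.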

Summary. The experiments presented in this section
demonstrate that Time4 performs significantly better than
untimed approaches, especially when the update involves multiple
switches, or when there is a non-deterministic installation
latency. Interestingly, Time4 can be used in conjunction with
existing approaches, such as SWAN and B4, allowing the same
level of packet loss with less overhead than the untimed
variants.

Fig. 13: Performance as a function of $I_R$ and $\delta$: (a) the number of packets lost
in a flow swap vs. the installation latency range, $I_R$; (b) the number of packets lost
in a flow swap vs. the scheduling error, $\delta$. Untimed
updates are affected by the installation latency variation ($I_R$),
whereas Time4 is affected by the scheduling error ($\delta$). Time4
is advantageous since typically $\delta < I_R$.

V. Discussion 

1) Scheduling accuracy 

The advantage of timed updates greatly depends on the
scheduling accuracy, i.e., on the switches’ ability to accurately
perform an update at its scheduled time. Clocks can typically
be synchronized on the order of 1 microsecond (e.g., [6])
using PTP [5]. However, a switch’s ability to accurately
perform a scheduled action depends on its implementation.

• Software switches: Our experimental evaluation showed 
that the scheduling error in the software switches we 
tested was on the order of 1 millisecond. 

• Hardware-based scheduling: The work of [30] has shown 
a method that allows the scheduling error of timed events 
in hardware switches to be as low as 1 microsecond. 

• Software-based scheduling in hardware switches: A
scheduling mechanism that relies on the switch’s software
may be affected by the switch’s operating system and by
other running tasks. Measures can be taken to implement
an accurate software-based scheduling in Time4: when
a switch is aware of an update that is scheduled to
take place at time $T_s$, it can avoid performing heavy
maintenance tasks at this time, such as TCAM entry
rearrangement. Update messages received slightly before
time $T_s$ can be queued and processed after the scheduled
update is executed. Moreover, if a switch receives a timed
command that is scheduled to take place at the same time
as a previously received command, it can send an error
message to the controller, indicating that the last received
command cannot be executed.

It is an important observation that in a typical system we
expect the scheduling error to be lower than the installation
latency variation, i.e., $\delta < I_R$. Untimed updates have a non-
deterministic installation latency. On the other hand, timed
updates are predictable, and can be scheduled in a way that
avoids conflicts between multiple updates, allowing $\delta$ to be
typically lower than $I_R$.

2) Model assumptions 

Our model assumes a lossless network with unsplittable,
fixed-bandwidth flows. A notable example of a setting in which
these assumptions are often valid is a WAN or a carrier
network. In carrier networks the maximal bandwidth of a
service is defined by its bandwidth profile [27]. Thus, the
controller cannot dynamically change the bandwidth of the
flows, as they are determined by the SLA. The Frame Loss
Ratio (FLR) is one of the key performance attributes [27] that a
service provider must comply with, and cannot be compromised.
Splitting a flow between two or more paths may result in
packets being received out-of-order. Packet reordering is a
key performance parameter in carrier-grade performance and
availability measurement, as it affects various applications
such as real-time media streaming [58]. Thus, all packets of a
flow are forwarded through the same path.

3) Short term vs. long term scheduling 

The OpenFlow time extension we presented in Section III 
is intended for short term scheduling; a controller should 
schedule an action to a near-future time, on the order of 
seconds in the future. The challenge in long term scheduling 
is that during the long period between the time at which the 
Scheduled Bundle was sent and the time at which it is meant to 
be executed, various external events may occur: the controller 
may fail or reboot, or a second controller^ may try to perform 
a conflicting update. Near-future scheduling guarantees that 
external events that may affect the scheduled operation, such 
as a switch reboot, have a low probability of occurring. Since 
near-future scheduling is on the order of seconds, this short 
potentially hazardous period is no worse than in conventional 
updates, where an OpenFlow command may be executed a 
few seconds after it was sent by the controller. 

4) Network latency 

In Fig. 1, the switches S1 and S3 are updated at the same 
time, as it is implicitly assumed that all the links have the same 
latency. In the general case each link has a different latency, 
and thus S1 and S3 should not be updated at the same time, 
but at two different times, T1 and T3, that account for the 
different latencies. 

^In an SDN with a distributed control plane, where more than one controller 
is used. 


5) Failures 

A timed update may fail to be performed in a coordinated 
way at multiple switches if some of the switches have failed, 
or if some of the controller commands have failed to reach 
some of the switches. Therefore, the controller uses a reliable 
transport protocol (TCP), in which dropped packets are 
retransmitted. If the controller detects that a switch has failed, or 
failed to receive some of the Bundle messages, the controller 
can use a Bundle Discard message to cancel the coordinated update. 
Note that the controller should send timed update messages 
sufficiently ahead of the scheduled time of execution, allowing 
enough time for possible retransmission and Discard message 
transmission. 

6) Controller performance overhead 

The prototype design we presented (Fig. 7) uses 
ReversePTP [46] to synchronize the switches and the controller. 
A synchronization protocol may yield some performance 
overhead on the controller and switches, and some overhead on the 
network bandwidth. In our experiments we observed that the 
CPU utilization of the PTP processes in the controller in an 
experiment with 32 switches was 5% on the weakest machine 
we tested, and significantly less than 1% on the stronger 
machines. As for the network bandwidth overhead, accurate 
synchronization using PTP typically requires the controller to 
exchange ~5 packets per second per switch [59], a negligible 
overhead in high-speed networks. 


VI. Conclusion 

Time and clocks are valuable tools for coordinating 
updates in a network. We have shown that dynamic traffic 
steering by SDN controllers requires flow swaps, which are 
best performed as close to instantaneously as possible. 
Time-based operation can help to achieve carrier-grade packet loss 
rate in environments that require rapid path reconfiguration. 
Our OpenFlow time extension can be used for implementing 
flow swaps and Time4. It can also be used for a variety 
of additional timed update scenarios that can help improve 
network performance during path and policy updates. 
Appendix A 

A Time Extension to the OpenFlow Protocol 

A. Introduction 

This section defines a time extension to the OpenFlow protocol. This extension allows the controller to send OpenFlow 
commands that include an execution time, indicating to the switch when the respective command should be performed. 

As specified in [19], a bundle is a sequence of (one or more) OpenFlow modification requests from the controller that is 
applied as a single OpenFlow operation. The controller uses a commit message to apply the set of requests in the bundle. 
Consequently, the switch applies all messages in the bundle as a single operation or returns an error. 

This extension defines scheduled bundles: a bundle commit request may include an execution time, specifying when the 
bundle should be committed. A switch that receives a scheduled bundle commits the bundle as close as possible to the 
execution time that was specified in the commit message. 

This document also defines the bundle features message, allowing the controller to retrieve information about the switch’s 
bundle support, and specifically about its scheduled bundle support. 


B. How It Works 

1) Overview 

This extension allows a bundle operation to be invoked at a scheduled time that is determined by the controller. 

The time-based bundle procedure is illustrated in Figure 14: 

1) The controller starts the bundle procedure by sending an OFPBCT_OPEN_REQUEST, and receives a reply from the 
switch. 

2) The controller then sends a set of N OFPT_BUNDLE_ADD_MESSAGE messages, for some N ≥ 1. 

3) The controller MAY then send an OFPBCT_CLOSE_REQUEST. The close request is optional, and thus the controller 
may skip this step. 

4) The controller sends an OFPBCT_COMMIT_REQUEST. The OFPBCT_COMMIT_REQUEST includes two time-related 
fields: the time flag and optionally the time property. When the time flag is set, it indicates that this is a scheduled 
commit. A scheduled commit request includes the time property field, which contains the scheduled time at which the 
switch is expected to apply the bundle. 

5) After receiving the commit message, the switch applies the bundle at the scheduled time, Ts, and sends an 
OFPBCT_COMMIT_REPLY to the controller. 


Fig. 14: Scheduled Bundle Procedure. (Timeline annotations: the commit includes the time property; the switch executes the bundle at time Ts.) 
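
The message ordering in the procedure above can be illustrated with a small validator. This is a sketch under stated assumptions: the enums `step` and `state` and the function `sequence_is_valid` are hypothetical, the numeric values are local to the sketch rather than OpenFlow wire values, and the sketch reads the step list as requiring at least one add message before the close or commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Message kinds named in the procedure (values are local to this sketch). */
enum step {
    STEP_OPEN,    /* OFPBCT_OPEN_REQUEST */
    STEP_ADD,     /* OFPT_BUNDLE_ADD_MESSAGE */
    STEP_CLOSE,   /* OFPBCT_CLOSE_REQUEST (optional) */
    STEP_COMMIT   /* OFPBCT_COMMIT_REQUEST */
};

enum state { ST_IDLE, ST_OPEN, ST_CLOSED, ST_COMMITTED, ST_ERROR };

/* Returns true if steps[0..n-1] is open, one or more adds, an optional
 * close, and exactly one final commit, in that order. */
bool sequence_is_valid(const enum step *steps, size_t n)
{
    enum state st = ST_IDLE;
    size_t adds = 0;

    for (size_t i = 0; i < n; i++) {
        switch (st) {
        case ST_IDLE:
            st = (steps[i] == STEP_OPEN) ? ST_OPEN : ST_ERROR;
            break;
        case ST_OPEN:
            if (steps[i] == STEP_ADD) {
                adds++;
            } else if (steps[i] == STEP_CLOSE && adds >= 1) {
                st = ST_CLOSED;
            } else if (steps[i] == STEP_COMMIT && adds >= 1) {
                st = ST_COMMITTED;  /* close step was skipped */
            } else {
                st = ST_ERROR;
            }
            break;
        case ST_CLOSED:
            st = (steps[i] == STEP_COMMIT) ? ST_COMMITTED : ST_ERROR;
            break;
        default:
            st = ST_ERROR;          /* nothing may follow the commit */
            break;
        }
        if (st == ST_ERROR) {
            return false;
        }
    }
    return st == ST_COMMITTED;
}
```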
Discarding scheduled bundles. The controller may cancel a scheduled commit by sending an OFPT_BUNDLE_CONTROL 
message with type OFPBCT_DISCARD_REQUEST. An example is shown in Figure 15. If the switch is not able to schedule 
the operation after receiving the commit message, it responds to the controller with an error message (see Section A-E). This indication 
may be used for implementing a coordinated update where either all the switches successfully schedule the operation, or the 
bundle is discarded: when a controller receives a scheduling error message from one of the switches, it can send a discard 
message (step 5' in Figure 15) to the other switches that need to commit a bundle at the same time, and abort the bundle. 


Fig. 15: Discarding a Scheduled Commit. (Timeline annotations: the controller sends a discard; the switch does not execute the bundle.) 
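
The all-or-nothing use of the discard message can be sketched from the controller's side. The sketch below is illustrative only: `sched_reply` and `plan_discards` are hypothetical names, and the actual sending of OFPBCT_DISCARD_REQUEST messages is left out. It decides which switches should receive a discard once the controller knows which of them reported a scheduling error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-switch outcome of a scheduled commit, as seen by the controller. */
enum sched_reply {
    REPLY_NO_ERROR,  /* no scheduling error received from this switch */
    REPLY_ERROR      /* switch reported that it cannot schedule the commit */
};

/* Sets discard[i] for every switch that should be sent a discard message.
 * Returns the number of discards to send; 0 means the coordinated update
 * can proceed. */
size_t plan_discards(const enum sched_reply *replies, bool *discard, size_t n)
{
    bool any_error = false;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (replies[i] == REPLY_ERROR) {
            any_error = true;
        }
    }
    for (size_t i = 0; i < n; i++) {
        /* Following the text, the discard goes to the other switches,
         * i.e., those that did not themselves report the error. */
        discard[i] = any_error && replies[i] != REPLY_ERROR;
        if (discard[i]) {
            count++;
        }
    }
    return count;
}
```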


2) Timekeeping and Synchronization 

Every switch that supports scheduled bundles must maintain a clock. It is assumed that clocks are synchronized by a method 
that is outside the scope of this document, e.g., the Network Time Protocol (NTP) or the Precision Time Protocol (PTP). 

Two factors affect how accurately a switch can commit a scheduled bundle; one factor is the accuracy of the clock 
synchronization method used to synchronize the switches’ clocks, and the second factor is the switch’s ability to execute 
real-time operations, which greatly depends on how it is implemented. 

This document does not define any requirements pertaining to the degree of accuracy of performing scheduled operations. 
However, every switch that supports the time extension is able to report its estimated scheduling accuracy to the controller. 
The controller can retrieve this information from the switch using the bundle features message, defined in Section A-D. 

Since a switch does not perform configuration changes instantaneously, the processing time of required operations should 
not be overlooked; in the context of the extension described in this paper the scheduled time and execution time always refer 
to the start time of the relevant operation. 

3) Scheduling Tolerance 

When a switch receives a scheduled commit message, it MUST verify that the scheduled time, Ts, is not too far in the past 
or in the future. As illustrated in Figure 16, the switch verifies that Ts is within the scheduling tolerance range. 

The lower bound on Ts verifies the freshness of the packet so as to avoid acting upon old and possibly irrelevant messages. 
Similarly, the upper bound on Ts guarantees that the switch does not take a long-term commitment to execute an action that 
may become obsolete by the time it is scheduled to be invoked. 

The scheduling tolerance is determined by two parameters, sched_max_future and sched_max_past. The default 
value of these two parameters is 1 second. The controller MAY set these fields to a different value using the bundle features 
request, as described in Section A-D. 

Fig. 16: Scheduling Tolerance. (Timeline annotations: the switch receives the commit; the scheduling tolerance extends sched_max_past before and sched_max_future after that time.) 

If the scheduled time, Ts, is within the scheduling tolerance range, the scheduled commit is performed; if Ts occurs in the 
past and within the scheduling tolerance, the switch applies the bundle as soon as possible. If Ts is a future time, the switch 
applies the bundle at Ts. If Ts is not within the scheduling tolerance range, the switch responds to the controller with an error 
message. 
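
The tolerance check can be sketched as a single decision function. This is a minimal sketch, not the prototype's code: times are flattened to nanoseconds for brevity, the names `sched_decision` and `check_tolerance` are hypothetical, and treating the bounds as inclusive is an assumption (the text does not say whether a Ts exactly at the bound is accepted).

```c
#include <assert.h>
#include <stdint.h>

/* Possible outcomes when a scheduled commit for time Ts is received. */
enum sched_decision {
    SCHED_APPLY_NOW,    /* Ts is in the past, within sched_max_past */
    SCHED_APPLY_AT_TS,  /* Ts is in the future, within sched_max_future */
    SCHED_ERR_PAST,     /* Ts is too far in the past: reply with an error */
    SCHED_ERR_FUTURE    /* Ts is too far in the future: reply with an error */
};

/* now_ns is the switch's clock when the commit is received; all values
 * are nanoseconds since the epoch (a simplification for this sketch). */
enum sched_decision check_tolerance(uint64_t now_ns, uint64_t ts_ns,
                                    uint64_t max_past_ns,
                                    uint64_t max_future_ns)
{
    if (ts_ns > now_ns) {
        return (ts_ns - now_ns <= max_future_ns) ? SCHED_APPLY_AT_TS
                                                 : SCHED_ERR_FUTURE;
    }
    return (now_ns - ts_ns <= max_past_ns) ? SCHED_APPLY_NOW
                                           : SCHED_ERR_PAST;
}
```

With the default tolerance of 1 second in each direction, a commit scheduled half a second ahead is applied at Ts, while one scheduled two seconds ahead is rejected.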

C. Time-based Bundle Messages 

This section updates Section 7.3.9 of [19]. The reader is assumed to be familiar with Sections 6.8 and 7.3.9 of [19]. 

The time extension allows bundle commit messages to include a time property, defining when the bundle should be executed. 
The time extension defines two time-related fields in OFPBCT_COMMIT_REQUEST messages: 

• The time flag, denoted OFPBF_TIME. 

• The time property. 

All OFPT_BUNDLE_CONTROL messages include the OFPBF_TIME flag. In control messages with type 
OFPBCT_COMMIT_REQUEST the time flag MAY be set, indicating that the time property field is present. The time 
property incorporates the time at which the switch is scheduled to apply the bundle. 

Control messages with a type that is not OFPBCT_COMMIT_REQUEST MUST have the OFPBF_TIME flag disabled, and 
this flag is ignored by the switch in these messages. 

1) The Time Flag 

This document updates ofp_bundle_flags by adding the OFPBF_TIME flag, as follows: 

/* Bundle configuration flags. */ 
enum ofp_bundle_flags { 
    OFPBF_TIME = 1 << 2, /* Execute at a specific time. */ 
}; 


2) The Bundle Time Property 

This document defines a new bundle property, the time property. 

/* Bundle property */ 
struct ofp_bundle_prop_time { 
    uint16_t type;   /* OFPBPT_TIME */ 
    uint16_t length; /* Length in bytes = 24 */ 
    uint8_t pad[4]; 
    struct ofp_time scheduled_time; /* The scheduled time at which the switch should apply the bundle. */ 
}; 
OFP_ASSERT(sizeof(struct ofp_bundle_prop_time) == 24); 
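
As an illustration of the 24-byte layout, the sketch below writes a time property into a byte buffer. It assumes network (big-endian) byte order, as used by OpenFlow messages in general; the helper `put_be` and the function `encode_time_prop` are hypothetical names introduced for this example, and the type value 1 is the OFPBPT_TIME value given in this appendix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OFPBPT_TIME   1   /* property type value from this appendix */
#define TIME_PROP_LEN 24  /* total length of the time property, in bytes */

/* Store the low nbytes of value at dst, most significant byte first. */
static void put_be(uint8_t *dst, uint64_t value, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        dst[nbytes - 1 - i] = (uint8_t)(value & 0xFF);
        value >>= 8;
    }
}

/* Encode a time property carrying the scheduled time (seconds, nanoseconds).
 * Returns 0 on success, or -1 if nanoseconds is outside 0..10^9 - 1. */
int encode_time_prop(uint8_t buf[TIME_PROP_LEN], uint64_t seconds,
                     uint32_t nanoseconds)
{
    if (nanoseconds > 999999999u) {
        return -1;
    }
    memset(buf, 0, TIME_PROP_LEN);      /* zeroes both pad fields */
    put_be(buf + 0, OFPBPT_TIME, 2);    /* type */
    put_be(buf + 2, TIME_PROP_LEN, 2);  /* length */
    /* bytes 4..7: pad */
    put_be(buf + 8, seconds, 8);        /* scheduled_time.seconds */
    put_be(buf + 16, nanoseconds, 4);   /* scheduled_time.nanoseconds */
    /* bytes 20..23: pad inside ofp_time */
    return 0;
}
```

Writing fields byte by byte avoids depending on the host's endianness or on compiler struct padding.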

The type field in the time property is set to the value OFPBPT_TIME, defined as follows: 

/* Bundle property types. */ 
enum ofp_bundle_prop_type { 
    OFPBPT_TIME = 1, /* Time property. */ 
}; 

3) Time Format 

The time format defined in this extension is based on the one defined in [5]. It consists of two sub-fields: a seconds field, 
representing the integer portion of time in seconds^^, and a nanoseconds field, representing the fractional portion of time 
in nanoseconds, i.e., 0 ≤ nanoseconds ≤ (10^9 - 1). 

/* Time format */ 
struct ofp_time { 
uint64_t seconds; 
uint32_t nanoseconds; 
uint8_t pad[4]; 

}; 

OFP_ASSERT(sizeof(struct ofp_time) == 16); 

As defined in [5], time is measured according to the International Atomic Time (TAI) timescale. The epoch is defined as 1 
January 1970 00:00:00 TAI. 
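
Since many systems keep UTC-based time, a controller or switch may need to convert to the TAI timescale before filling in an ofp_time. The sketch below shows one way to do this; the function `ofp_time_from_utc` and its parameters are hypothetical. The TAI-UTC offset is passed in by the caller (it has been 37 seconds since 1 January 2017, and changes whenever a leap second is introduced), so obtaining the current value is outside the scope of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Time format, as defined above. */
struct ofp_time {
    uint64_t seconds;
    uint32_t nanoseconds;
    uint8_t pad[4];
};

/* Build a TAI-based ofp_time from a UTC-based time given as seconds and
 * nanoseconds since 1 January 1970 00:00:00 UTC.
 * tai_minus_utc is the TAI-UTC offset in seconds, supplied by the caller.
 * Returns 0 on success, or -1 if an input is out of range. */
int ofp_time_from_utc(struct ofp_time *out, int64_t utc_sec, long utc_nsec,
                      int tai_minus_utc)
{
    if (utc_sec < 0 || tai_minus_utc < 0) {
        return -1;
    }
    if (utc_nsec < 0 || utc_nsec > 999999999L) {
        return -1;  /* nanoseconds must be in 0..10^9 - 1 */
    }
    memset(out, 0, sizeof(*out));  /* clears the pad bytes */
    out->seconds = (uint64_t)utc_sec + (uint64_t)tai_minus_utc;
    out->nanoseconds = (uint32_t)utc_nsec;
    return 0;
}
```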


D. Bundle Features Request 

The bundle features request defined in this document allows a controller to query a switch about its bundle capabilities, 
including its scheduled bundle capabilities. 

This section extends Section 7.3.5 of [19]. The reader is assumed to be familiar with Section 7.3.5 of [19]. 

The bundle features request is a new multipart message type, the OFPMP_BUNDLE_FEATURES message. This document 
updates ofp_multipart_type by adding the OFPMP_BUNDLE_FEATURES type, as follows: 

enum ofp_multipart_type { 
    /* Bundle features. 
     * The request body is ofp_bundle_features_request. 
     * The reply body is struct ofp_bundle_features. */ 
    OFPMP_BUNDLE_FEATURES = 17, 
}; 

1) Bundle Features Request Message Format 

The body of the bundle features request message is defined by struct ofp_bundle_features_request, as follows: 

/* Body of OFPMP_BUNDLE_FEATURES request. */ 
struct ofp_bundle_features_request { 
    uint32_t feature_request_flags; /* Bitmap of "ofp_bundle_feature_flags". */ 
    uint8_t pad[4]; 

    /* Bundle features property list - 0 or more. */ 
    struct ofp_bundle_features_prop_header properties[0]; 
}; 
OFP_ASSERT(sizeof(struct ofp_bundle_features_request) == 8); 

The body consists of a flags field, followed by zero or more property TLV fields. The flags field, 
feature_request_flags, is defined as follows: 

/* Flags used in a OFPMP_BUNDLE_FEATURES request. */ 
enum ofp_bundle_feature_flags { 
    OFPBF_TIMESTAMP = 1 << 0,      /* When enabled, the current request includes a timestamp, using 
                                    * the time property */ 
    OFPBF_TIME_SET_SCHED = 1 << 1, /* When enabled, the current request includes the sched_max_future 
                                    * and sched_max_past parameters, using the time property */ 
}; 


If at least one of the flags OFPBF_TIMESTAMP or OFPBF_TIME_SET_SCHED is set, the bundle features request includes 
a time property. 

The bundle features properties are specified below. 

2) Bundle Features Reply Message Format 

If the features request is processed successfully by the switch, it sends a reply to the controller. The body of the bundle 
features reply message is struct ofp_bundle_features, as follows: 

/* Body of reply to OFPMP_BUNDLE_FEATURES request. */ 
struct ofp_bundle_features { 
    uint16_t capabilities; /* Bitmap of "ofp_bundle_flags". */ 
    uint8_t pad[6]; 

    /* Bundle features property list - 0 or more. */ 
    struct ofp_bundle_features_prop_header properties[0]; 
}; 
OFP_ASSERT(sizeof(struct ofp_bundle_features) == 8); 

^^The seconds field in IEEE 1588 is 48 bits long. The seconds field used in this extension is a 64-bit field, but it has the same semantics as the seconds 
field in the IEEE 1588 time format. 

3) Bundle Features Properties 

The optional property fields are defined as TLVs with a common header format, as follows: 

/* Common header for all bundle feature properties */ 
struct ofp_bundle_features_prop_header { 
    uint16_t type;   /* One of OFPTMPBF_*. */ 
    uint16_t length; /* Length in bytes of this property. */ 
}; 
OFP_ASSERT(sizeof(struct ofp_bundle_features_prop_header) == 4); 

The currently defined types are as follows: 

/* Bundle features property types. */ 
enum ofp_bundle_features_prop_type { 
    OFPTMPBF_TIME_CAPABILITY = 0x1, /* Time feature property. */ 
    OFPTMPBF_EXPERIMENTER = 0xFFFF, /* Experimenter property. */ 
}; 


The Bundle Features Time Property. 

A bundle feature request in which at least one of the flags OFPBF_TIMESTAMP or OFPBF_TIME_SET_SCHED is set 
incorporates the time property. A bundle feature reply that has the OFPBF_TIME flag set incorporates the time property. 

The time property is defined as follows: 

struct ofp_bundle_features_prop_time { 
    uint16_t type;   /* OFPTMPBF_TIME_CAPABILITY. */ 
    uint16_t length; /* Length in bytes of this property. */ 
    uint8_t pad[4]; 

    struct ofp_time sched_accuracy;   /* The scheduling accuracy, i.e., how accurately the switch can 
                                       * perform a scheduled commit. This field is used only in bundle 
                                       * features replies, and is ignored in bundle features requests. */ 
    struct ofp_time sched_max_future; /* The maximal difference between the 
                                       * scheduling time and the current time. */ 
    struct ofp_time sched_max_past;   /* If the scheduling time occurs in the past, defines the maximal 
                                       * difference between the current time and the scheduling time. */ 
    struct ofp_time timestamp;        /* Indicates the time during the transmission of this message. */ 
}; 
OFP_ASSERT(sizeof(struct ofp_bundle_features_prop_time) == 72); 

The time property in a bundle features request includes: 

• sched_accuracy: this field is relevant only to bundle features replies, and the switch must ignore this field in a bundle 
features request. 

• sched_max_future and sched_max_past: a switch that receives a bundle features request with 
OFPBF_TIME_SET_SCHED set should attempt to change its scheduling tolerance values according to the 
sched_max_future and sched_max_past values from the time property. If the switch does not successfully 
update its scheduling tolerance values, it replies with an error message. 

• timestamp, indicating the controller’s time during the transmission of this message. A switch that receives a bundle 
features request with OFPBF_TIMESTAMP set may use the received timestamp to roughly estimate the offset between 
its clock and the controller’s clock. 

The time property in a bundle features reply includes: 

• sched_accuracy, indicating the estimated scheduling accuracy of the switch. For example, if the value of 
sched_accuracy is 1000000 nanoseconds (1 ms), it means that when the switch receives a bundle commit scheduled 
to time Ts, the commit will in practice be invoked at Ts ± 1 ms. The factors that affect the scheduling accuracy are 
discussed in Section A-B. 

• sched_max_future and sched_max_past, containing the scheduling tolerance values of the switch. If the 
corresponding bundle features request has the OFPBF_SET_TIME_TOLERANCE flag enabled, these two fields are 
identical to the ones sent by the controller in the request. 

• timestamp, indicating the switch’s time during the transmission of this feature reply. Every bundle feature reply that 
includes the time property also includes a timestamp. The timestamp may be used by the controller to get a rough estimate 
of whether the switch’s clock is synchronized to the controller’s. 
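
A controller might use the reply fields as sketched below. This is an illustration under stated assumptions, not part of the extension: the helpers `to_ns`, `exec_window`, and `clock_looks_synced` and the threshold `limit_ns` are hypothetical. Times are flattened to signed 64-bit nanoseconds, which is adequate for present-day epoch values. Because the comparison in `clock_looks_synced` includes the network delay of the reply, it can only give the rough estimate the text describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Time format, as defined in this appendix. */
struct ofp_time {
    uint64_t seconds;
    uint32_t nanoseconds;
    uint8_t pad[4];
};

/* Flatten an ofp_time to nanoseconds (sketch-level simplification). */
static int64_t to_ns(const struct ofp_time *t)
{
    return (int64_t)t->seconds * 1000000000LL + (int64_t)t->nanoseconds;
}

/* Expected invocation window for a commit scheduled at ts, given the
 * sched_accuracy value reported by the switch: [ts - acc, ts + acc]. */
void exec_window(const struct ofp_time *ts,
                 const struct ofp_time *sched_accuracy,
                 int64_t *earliest_ns, int64_t *latest_ns)
{
    *earliest_ns = to_ns(ts) - to_ns(sched_accuracy);
    *latest_ns = to_ns(ts) + to_ns(sched_accuracy);
}

/* Rough synchronization check: compare the timestamp in the reply with
 * the controller's clock when the reply was received. limit_ns must
 * leave room for the network delay of the reply. */
bool clock_looks_synced(const struct ofp_time *reply_timestamp,
                        const struct ofp_time *controller_rx_time,
                        int64_t limit_ns)
{
    int64_t diff = to_ns(controller_rx_time) - to_ns(reply_timestamp);
    if (diff < 0) {
        diff = -diff;
    }
    return diff <= limit_ns;
}
```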
E. Errors 

As defined in Section 7.5.4 of [19] the switch can send an error message to the controller, which includes a type and a 
code. This document extends Section 7.5.4 with additional codes, as specified below. 

1) Bundle Error 

When the switch has an error related to the bundle operation, it sends an error message with type OFPET_BUNDLE_FAILED. 
This document defines the following new codes: 

• OFPBFC_SCHED_NOT_SUPPORTED - this code is used when the switch does not support scheduled bundle execution 
and receives a commit message with the OFPBF_TIME flag set. 

• OFPBFC_SCHED_FUTURE - used when the switch receives a scheduled commit message and the scheduling time exceeds 
the sched_max_future (see Section A-B). 

• OFPBFC_SCHED_PAST - used when the switch receives a scheduled commit message and the scheduling time exceeds 
the sched_max_past (see Section A-B). 

The ofp_bundle_failed_code is updated as follows: 

enum ofp_bundle_failed_code { 

OFPBFC_SCHED_NOT_SUPPORTED = 16, /* Scheduled commit was received and scheduling is not supported. */ 

OFPBFC_SCHED_FUTURE = 17, /* Scheduled commit time exceeds upper bound. */ 

OFPBFC_SCHED_PAST = 18, /* Scheduled commit time exceeds lower bound. */ 

}; 


2) Bundle Features Error 

When the switch has an error related to the OFPMP_BUNDLE_FEATURES request, it replies with an error 
message of type OFPET_BAD_REQUEST. The code OFPBRC_MULTIPART_BAD_SCHED indicates that the request had the 
OFPBF_SET_TIME_TOLERANCE flag enabled, and the switch failed to update the scheduling tolerance values. 

The ofp_bad_request_code is updated as follows: 

enum ofp_bad_request_code { 
    OFPBRC_MULTIPART_BAD_SCHED = 16, /* Switch received a OFPMP_BUNDLE_FEATURES request and failed 
                                      * to update the scheduling tolerance. */ 
}; 



